architecture (2) attention (2) decoding (2) deep-learning (1) essays (6) gpu (2) inference (2) infra (1) llm (8) machine-learning (1) moe (1) multimodal (1) notes (10) performance (3) post-training (1) rl (1) roofline (1) sft (1) statistics (1) tutorials (13) vision (1)