Note: Since this is the first time the class is being taught, the schedule may be adjusted if we need more or less time on certain topics.
| 8/26 | Introduction [ slides ] | | |
| 8/28 | Language Modeling [ slides ] | | |
| 9/2 | Neural networks [ slides ] | | |
| 9/4 | Backpropagation [ slides ] | | |
| 9/9 | Embeddings [ slides ] | | |
| 9/11 | Transformers [ slides ] | | |
| 9/16 | Transformers (cont.) [ slides ] | | |
| 9/18 | Pretraining scaling [ slides ] | | |
| 9/23 | Multimodal models [ slides ] | | |
| 9/25 | No classes (Tu is OOO) | | |
| 9/30 | Prompting [ slides ] | | |
| 10/2 | Decoding strategies [ slides ] | | |
| 10/7 | Instruction tuning [ slides ] | | |
| 10/9 | Alignment [ slides ] | | |
| 10/14 | Large reasoning models & Test-time scaling [ slides ] | | |
| 10/16 | Large reasoning models & Test-time scaling (cont.) [ slides ] | | |
| 10/21 | Evaluation [ slides ] | | |
| 10/23 | Mixture-of-Experts [ slides ] | | |
| 10/28 | No classes (Tu is OOO) | | |
| 10/30 | Efficient attention [ slides ] | | |
| 11/4 | Parameter-efficient fine-tuning [ slides ] | | |
| 11/6 | Distillation, quantization, and pruning [ slides ] | | |
| 11/11 | Retrieval-augmented generation / Tool-use models [ slides ] | | |
| 11/13 | LLM Agents / Mixture-of-Agents [ slides ] | | |
| 11/18 | TBD [ slides ] | | |
| 11/20 | TBD [ slides ] | | |
| 11/25 | No classes (Thanksgiving break) | | |
| 11/27 | No classes (Thanksgiving break) | | |
| 12/2 | TBD [ slides ] | | |
| 12/4 | Diffusion models [ slides ] | | |
| 12/9 | Ethics and safety [ slides ] | | |