Date | Topic | Readings | Deliverables |
Week 1 — Mon Aug 26 | Lecture — Introduction, Language modeling [slides] | | |
Week 1 — Wed Aug 28 | Lecture — Transformers Recap [slides] | Transformer (Vaswani et al., 2017); The Annotated Transformer; Understanding LSTMs (optional); BERT (Devlin et al., 2018) | Paper list out Friday, Aug 30 |
Week 2 — Mon Sep 2 | Labor Day — No Class | | |
Week 2 — Wed Sep 4 | Lecture — Transformers Recap 2 [slides] | RoBERTa; ALBERT; ELECTRA; Scaling Laws for Neural Language Models | Paper Selection Due |
Week 3 — Mon Sep 9 | Lecture — GPT-3++ [slides] | BPE; Language Model Tokenizers Introduce Unfairness Between Languages; GPT-2: Language Models are Unsupervised Multitask Learners; GPT-3: Language Models are Few-Shot Learners; Scaling Laws for Neural Language Models | Project Guidelines out Sep 10 |
Week 3 — Wed Sep 11 | Lecture — Prompting, CoT [slides] | Demystifying Prompts in Language Models via Perplexity Estimation; Calibrate Before Use: Improving Few-Shot Performance of Language Models; Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?; Chain-of-Thought Prompting Elicits Reasoning in Large Language Models; Large Language Models are Zero-Shot Reasoners | |
Week 4 — Mon Sep 16 | Lecture — Scaling, Instruction Tuning [slides] | Scaling Laws for Neural Language Models; Training Compute-Optimal Large Language Models; Multitask Prompted Training Enables Zero-Shot Task Generalization; Scaling Instruction-Finetuned Language Models; Alpaca; Self-Instruct: Aligning Language Models with Self-Generated Instructions | |
Week 4 — Wed Sep 18 | Lecture — Instruction Tuning [slides] | How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources; Transformer Math; The Power of Scale for Parameter-Efficient Prompt Tuning; LoRA: Low-Rank Adaptation of Large Language Models | |
Week 5 — Mon Sep 23 | Student Presentations — The False Promise of Imitating Proprietary LLMs [slides] | Alpaca; Self-Instruct: Aligning Language Models with Self-Generated Instructions | Project Proposal Due |
Week 5 — Wed Sep 25 | Student Presentations — LIMA [slides] | Dataset: https://huggingface.co/datasets/GAIR/lima; Constitutional AI: Harmlessness from AI Feedback; The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning | |
Week 6 — Mon Sep 30 | Student Presentations — Chatbot Arena | Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference; Human Feedback is not Gold Standard | |
Week 6 — Wed Oct 2 | Student Presentations — Length-Controlled AlpacaEval; Review 1 discussion — MixEval | Eval leaderboard: https://tatsu-lab.github.io/alpaca_eval/ | Review 1 Due: Oct 1, 11:59 p.m. |
Week 7 — Mon Oct 7 | Tanya Travelling — No Lecture | | |
Week 7 — Wed Oct 9 | Student Presentations — AutoBencher | BenchBench; tinyBench; Evaluation Examples Are Not Equally Informative: How Should That Change NLP Leaderboards? | |
Week 8 — Mon Oct 14 | Indigenous Peoples' Day — No Class | | |
Week 8 — Wed Oct 16 | Student Presentations — Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? | How Pre-trained Models Capture Factual Knowledge?; How Do Language Models Acquire Factual Knowledge During Pretraining? | |
Week 9 — Mon Oct 21 | Student Presentations — Large Language Models Struggle to Learn Long-Tail Knowledge | | |
Week 9 — Wed Oct 23 | Student Presentations — FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation; FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation | WICE: Real-World Entailment for Claims in Wikipedia; DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models; Context versus Prior Knowledge in Language Models | |
Week 10 — Mon Oct 28 | Lecture — Alignment [slides] | Proximal Policy Optimization Algorithms; Learning to Summarize from Human Feedback; The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization; A General Theoretical Paradigm to Understand Learning from Human Preferences | |
Week 10 — Wed Oct 30 | Student Presentations — A Long Way to Go: Investigating Length Correlations in RLHF | Length Desensitization in Direct Preference Optimization; Disentangling Length from Quality in Direct Preference Optimization; Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking | |
Week 11 — Mon Nov 4 | Student Presentations — Scaling Laws for Reward Model Overoptimization; Review 2 discussion — Iterative Preference Optimization with the Pairwise Cringe Loss | | Review 2 Due: Nov 3, 11:59 p.m. |
Week 11 — Wed Nov 6 | Student Presentations — SimPO: Simple Preference Optimization with a Reference-Free Reward | | |
Week 12 — Mon Nov 11 | Student Presentations — LoRA: Low-Rank Adaptation of Large Language Models | | |
Week 12 — Wed Nov 13 | Student Presentations — StreamingLLM | | Check-in Due: Nov 13, 11:59 p.m. |
Week 13 — Mon Nov 18 | Student Presentations — Speculative Decoding | | |
Week 13 — Wed Nov 20 | Student Presentations — Medusa Decoding | | |
Week 14 — Mon Nov 25 | Student Presentations — Generalization through Memorization: Nearest Neighbor Language Models | kNN-LM Does Not Improve Open-ended Text Generation | |
Week 14 — Wed Nov 27 | Thanksgiving Break — No Class | | |
Week 15 — Mon Dec 2 | Lecture | | |
Week 15 — Wed Dec 4 | Project Presentations | | Project Presentation |
Week 16 — Mon Dec 9 | Project Presentations | | Project Report (Due Dec 16) |