Large-Scale Foundation Models

Class Overview

This course covers algorithms and architectures for large-scale foundation models, including self-attention mechanisms, approximate attention algorithms for long-context sequences, subquadratic-time attention models, mixture of experts, autoregressive and generative models (such as Diffusion Transformers), efficient decoding strategies, distributed training, and parameter-efficient fine-tuning. The course concludes with project-based presentations.
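
For a first look at the mechanism that many of these topics build on (and that the long-context and approximation lectures aim to speed up), below is a minimal NumPy sketch of scaled dot-product self-attention. It is illustrative only, not course material; the function name, shapes, and random weights are assumptions for the example.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token matrix X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project tokens to queries/keys/values
    scores = (Q @ K.T) / np.sqrt(K.shape[-1])     # (n, n) pairwise similarities, scaled by sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax: attention weights
    return weights @ V                            # each output is a weighted sum of value vectors

# Illustrative usage (hypothetical sizes): 4 tokens of width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # -> (4, 8)
```

The (n, n) score matrix in this sketch is the source of attention's quadratic cost in sequence length, which is exactly what the long-context, approximate-attention, and subquadratic-model lectures address.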

Important Notes

  • The class will be held in person.
  • We will use Classum and KLMS (please access Classum through KLMS).
  • Lecture slides will be uploaded to KLMS.
  • To contact the instructor/TAs, please use ee595b-25fall@googlegroups.com; emails sent to individual addresses will not be answered.
  • No Midterm/Final Exams

Lectures

  • When: Mon/Wed, 14:30~16:00
  • Where: Kim Beang-Ho & Kim Sam-Youl ITC Building (N1) #111

Instructor


Teaching Assistants


Grading Policy

    TBD

Tentative Schedule

This schedule is tentative and subject to change. Please check back often.

Week  Date         Lecture
1     9/1 (Mon)    Lecture 1: Course Overview
      9/3 (Wed)    Lecture 2: Sequence Modeling with RNN, Transformer
2     9/8 (Mon)    Lecture 3: Training Language Models and Decoding
      9/10 (Wed)   Lecture 4: Decoding Strategies and Speculative Decoding
3     9/15 (Mon)   Lecture 5: Modern Transformer Architecture
      9/17 (Wed)   Lecture 6: Scaling Laws, Mixture of Experts
4     9/22 (Mon)   Lecture 7: Long Context in Foundation Models and Flash Attention
      9/24 (Wed)   Lecture 8: Approximating Self-Attention
5     9/29 (Mon)   Lecture 9: Diffusion Models (Part 1)
      10/1 (Wed)   Lecture 10: Diffusion Models (Part 2)
6     10/6 (Mon)   Holiday (Chuseok) - No Class
      10/8 (Wed)   Holiday (Chuseok) - No Class
7     10/13 (Mon)  Lecture 11: Diffusion Models (Part 3)
      10/15 (Wed)  Lecture 12: Video Generation with Diffusion Models
8                  Midterm (No Exam)
9     10/27 (Mon)  Lecture 13: Distributed Training / Parallelism
      10/29 (Wed)  Lecture 14: Parameter-efficient Fine Tuning
10    11/3 (Mon)   Lecture 15: Quantization / Low-Precision Training
      11/5 (Wed)   Lecture 16: LLM Compression
11    11/10 (Mon)  Lecture 17: Direct Preference Optimization (DPO)
      11/12 (Wed)  Lecture 18: Multimodal Foundation Models
12    11/17 (Mon)  Lecture 19: Text-to-Image/Video Generation
      11/19 (Wed)  Lecture 20: State Space Models
13    11/24 (Mon)  Project Presentation 1
      11/26 (Wed)  Project Presentation 2
14    12/1 (Mon)   Project Presentation 3
      12/3 (Wed)   Project Presentation 4
15    12/8 (Mon)   Project Presentation 5
      12/10 (Wed)  Project Presentation 6
16                 Final (No Exam)

Class Policy

Students are encouraged to interact with classmates, as well as with the professor and the TAs, to discuss course material and assignment problems. In all your writing, including homework, essays, reports, and exams, use your own words, and acknowledge the source if you use someone else’s slides, quotes, figures, text, etc. Plagiarism and cheating are serious offenses and will be punished by failure on the assignment or in the course, and by suspension or expulsion from the university.