Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 1 - Diffusion

Stanford Online
Apr 10, 2026

Why It Matters

Understanding diffusion models is essential for anyone aiming to develop or apply cutting‑edge generative AI, a technology reshaping industries from entertainment to design.

Key Takeaways

  • Course covers fundamentals of diffusion and large vision models.
  • Prerequisites include linear algebra, probability, differential equations, ML basics.
  • Lectures focus on generation paradigms, architectures, training, evaluation, conditioning.
  • Exams are pen‑and‑paper, testing intuition and core formulas.
  • Class emphasizes consistent notation and intuition over exhaustive math.

Summary

The video introduces Stanford’s CME296 course on diffusion and large vision models, taught by twin brothers with experience at Uber, Google, and Netflix. It outlines the class’s two main goals—understanding image‑generation paradigms and the training/evaluation of underlying models—while stressing the technical prerequisites needed.

Key points include a rigorous prerequisite list (linear algebra, probability theory, differential equations, basic ML), a logistics plan (Friday lectures, recorded videos, two pen‑and‑paper exams), and a teaching philosophy that balances mathematical rigor with intuition. The instructors promise consistent notation across papers and a focus on core formulas rather than exhaustive derivations.

A memorable analogy compares the diffusion process to sculpting an image out of noisy rock, echoing Michelangelo's notion that the figure already lies within the stone, waiting to be freed. The instructors also explain why generation starts from Gaussian noise: it is easy to sample, it injects the randomness needed for diverse outputs, and it has convenient mathematical properties.
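Those three properties of Gaussian noise can be seen in a minimal sketch of the DDPM-style forward (noising) process covered later in the lecture. This is an illustrative toy, not the course's reference code: the schedule endpoints and the 4x4 "image" are hypothetical, though the closed form q(x_t | x_0) = N(sqrt(a_bar_t) x_0, (1 - a_bar_t) I) matches the standard DDPM formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image": a 4x4 array standing in for pixel values in [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(4, 4))

# Hypothetical linear variance schedule over T steps
# (the classic DDPM paper uses beta_1 = 1e-4 to beta_T = 0.02).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

def noise_to_step(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) x_0, (1 - a_bar_t) I).

    Gaussian noise is trivial to sample, and the closed form lets us
    jump to any timestep t without simulating the chain step by step.
    """
    eps = rng.standard_normal(x0.shape)  # fresh Gaussian draw = diversity
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x_early = noise_to_step(x0, 10)     # still close to the clean image
x_late = noise_to_step(x0, T - 1)   # nearly indistinguishable from pure noise
```

By the last step, `alphas_bar[T-1]` is vanishingly small, so `x_late` is essentially a pure Gaussian sample; reversing this corruption, step by step, is the "sculpting" the analogy describes.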

The course equips students to enter the fast‑evolving generative‑AI field, whether in research or industry, by providing a solid conceptual foundation and practical skills for building and evaluating state‑of‑the‑art image models.

Original Description

To follow along with the course schedule and syllabus, visit: https://cme296.stanford.edu/syllabus/
Chapters:
00:00:00 Introduction
00:06:04 Class logistics
00:12:23 Outline of the class
00:17:47 Motivating example
00:23:31 Intuition behind diffusion
00:26:33 Image representation
00:46:29 Variational formulation
00:49:01 Joint probability distribution
00:55:58 Strategy to derive a tractable loss
00:57:39 ELBO derivation
01:05:34 KL divergence
01:10:10 Bayes' rule
01:27:18 DDPM training
01:28:11 Inference with DDPM
01:29:49 Faster sampling with DDIM
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education
Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.
