Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 2 - Score Matching

Stanford Online · Apr 14, 2026

Why It Matters

Score matching provides a tractable, stable way to train diffusion models, accelerating high‑fidelity generative AI development.

Key Takeaways

  • Score matching replaces noise prediction with estimating the gradient of the log‑density.
  • The log‑density gradient (the score) is tractable and points in the same direction as the density gradient.
  • Langevin sampling uses estimated scores to generate diverse samples.
  • Denoising score matching leverages Gaussian‑perturbed samples to learn scores analytically.
  • Implicit and sliced score matching are older, less common methods.

Summary

Lecture two of Stanford CME296 introduces score matching as an alternative framework for generative modeling, complementing the diffusion‑based DDPM approach covered previously. The professor revisits the goal of sampling from an unknown data distribution and contrasts the reverse‑diffusion noise‑prediction strategy with a new perspective built on the gradient of the log‑density, known as the score.

The core insight is that while the data density itself is intractable because of its unknown normalizing constant, the gradient of its logarithm eliminates that constant and remains numerically stable, while still pointing toward higher‑density regions. Given an estimate of this score, one can employ Langevin dynamics, a stochastic MCMC technique, to move from easy‑to‑sample noise toward realistic data points while preserving diversity. The lecture then explains how denoising score matching uses Gaussian‑perturbed training samples, whose conditional score is known analytically, to learn the score without access to the true data distribution.
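The Langevin update described above can be sketched numerically. The following is an illustrative toy, not code from the lecture: it samples a 1‑D standard Gaussian using its known analytic score in place of the learned network a real diffusion model would use.

```python
import numpy as np

def score(x, mu=0.0, sigma=1.0):
    # Analytic score of N(mu, sigma^2): gradient of its log-density.
    # In practice this would be a trained neural network.
    return -(x - mu) / sigma**2

def langevin_sample(n_samples=10000, n_steps=500, step=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Start from an easy-to-sample distribution (uniform noise).
    x = rng.uniform(-5.0, 5.0, size=n_samples)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        # Unadjusted Langevin update: drift along the score plus
        # injected Gaussian noise scaled by sqrt(2 * step).
        x = x + step * score(x) + np.sqrt(2.0 * step) * noise
    return x

samples = langevin_sample()
```

After enough steps the empirical mean and standard deviation of `samples` approach 0 and 1, the moments of the target Gaussian; the injected noise is what preserves sample diversity.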

Examples include a 1‑D Gaussian, where the score simplifies to −(x − μ)/σ², illustrating why the vector points toward the mean. The instructor also references earlier methods such as implicit and sliced score matching, noting they are largely superseded by denoising approaches for practicality and performance.
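As a sanity check on that formula, a short numerical experiment (our own illustration, not from the lecture) compares a finite-difference gradient of the Gaussian log-density against the closed form −(x − μ)/σ²:

```python
import numpy as np

mu, sigma = 2.0, 1.5  # arbitrary example parameters

def log_density(x):
    # log N(x; mu, sigma^2). The normalizing constant is an additive
    # term here, so it vanishes under differentiation -- the key reason
    # the score sidesteps intractable normalization.
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def analytic_score(x):
    return -(x - mu) / sigma**2

x, h = 3.7, 1e-5
# Central finite difference approximates d/dx log p(x).
numeric = (log_density(x + h) - log_density(x - h)) / (2 * h)
print(numeric, analytic_score(x))
```

The two values agree to numerical precision, and since x > μ the score is negative, pulling the point back toward the mean.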

For practitioners, mastering score matching unlocks more flexible generative pipelines, enabling high‑quality image synthesis without explicit likelihood computation. It also aligns with current research trends that blend diffusion models with score‑based sampling, shaping the next wave of AI‑driven content creation.
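To make the denoising score matching recipe concrete, here is a minimal toy sketch (an illustration under our own assumptions, not the lecture's code): it fits a linear score model s(x) = a·x + b to Gaussian-perturbed data via gradient descent, using the analytic DSM regression target −(x̃ − x)/σ², which never requires the true data density.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=50000)  # toy dataset from N(2, 1)
sigma = 0.5                              # perturbation noise level

a, b, lr = 0.0, 0.0, 0.05
for _ in range(1500):
    x = data
    # Perturb clean samples with Gaussian noise.
    x_tilde = x + sigma * rng.standard_normal(x.shape)
    # DSM target: score of the Gaussian perturbation kernel q(x_tilde | x).
    target = -(x_tilde - x) / sigma**2
    pred = a * x_tilde + b
    err = pred - target
    # Gradient step on the mean-squared DSM loss w.r.t. a and b.
    a -= lr * 2 * np.mean(err * x_tilde)
    b -= lr * 2 * np.mean(err)

# The learned score should approximate the score of the perturbed
# marginal N(2, 1 + sigma^2): slope -1/1.25 = -0.8, intercept 1.6.
print(a, b)
```

The fitted coefficients recover the score of the noised data distribution, which is exactly what Langevin-style samplers consume.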

Original Description

To follow along with the course schedule and syllabus, visit: https://cme296.stanford.edu/syllabus/
Chapters:
00:00:00 Introduction
00:04:33 Motivation behind score matching
00:14:30 Langevin sampling
00:18:43 Score estimation
00:19:50 Implicit score matching, sliced score matching
00:20:56 Score of a Gaussian distribution
00:29:58 Denoising score matching
00:40:44 Limitations of DSM
00:49:49 Noise conditional score networks
00:52:49 Annealed Langevin dynamics
00:58:09 Parallel between DDPM and score
01:04:37 Continuous derivation
01:14:31 SDE formulation
01:22:54 Training
01:24:29 Reverse SDE
01:28:46 Inference with Euler-Maruyama
01:30:32 PF-ODE
01:41:24 DPM-Solver
For more information about Stanford’s graduate programs, visit: https://online.stanford.edu/graduate-education
Afshine Amidi is an Adjunct Lecturer at Stanford University.
Shervine Amidi is an Adjunct Lecturer at Stanford University.
