Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 2 - Score Matching
Why It Matters
Score matching provides a tractable, stable way to train diffusion models, accelerating high‑fidelity generative AI development.
Key Takeaways
- Score matching replaces direct noise prediction with estimating the gradient of the log-density (the score).
- The score is tractable because taking the log eliminates the normalizing constant, and it points in the same direction as the density gradient.
- Langevin sampling uses estimated scores to generate diverse samples from noise.
- Denoising score matching learns scores from Gaussian-perturbed training samples.
- Implicit and sliced score matching are older, less common alternatives.
Summary
Lecture two of Stanford CME296 introduces score matching as the next‑generation framework for generative modeling, following the diffusion‑based DDPM approach covered previously. The professor revisits the goal of sampling from an unknown data distribution and contrasts the traditional reverse‑diffusion noise‑prediction strategy with a new perspective that leverages the gradient of the log‑density, known as the score.
The core insight is that while the gradient of the raw density is intractable because of its unknown normalizing constant, the gradient of the log-density eliminates that constant and remains numerically stable, while still pointing toward higher-density regions. By estimating this score, one can employ Langevin dynamics, a stochastic MCMC technique, to move from easy-to-sample noise toward realistic data points while preserving diversity. The lecture explains how denoising score matching uses Gaussian-perturbed training samples to learn the score analytically, sidestepping the need for the true data distribution.
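The Langevin dynamics update described above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the score here is known analytically for a 1-D Gaussian N(μ=2, σ=0.5) (in practice a trained network supplies the estimate), and the step size and step count are arbitrary illustrative choices.

```python
# Minimal sketch of Langevin dynamics sampling with a known score.
import numpy as np

def score(x, mu=2.0, sigma=0.5):
    """Score (gradient of log-density) of N(mu, sigma^2): -(x - mu)/sigma^2."""
    return -(x - mu) / sigma**2

def langevin_sample(n_samples=10_000, n_steps=2_000, step=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)  # start from easy-to-sample noise
    for _ in range(n_steps):
        noise = rng.standard_normal(n_samples)
        # Langevin update: drift along the score plus injected Gaussian noise
        x = x + step * score(x) + np.sqrt(2 * step) * noise
    return x

samples = langevin_sample()
print(samples.mean(), samples.std())  # approaches mu = 2.0, sigma = 0.5
```

The injected noise term is what preserves diversity: without it the update collapses every sample onto the density's mode instead of sampling the full distribution.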
Examples include a 1-D Gaussian, where the score simplifies to −(x − μ)/σ², illustrating why the score vector points toward the mean. The instructor also references earlier methods such as implicit and sliced score matching, noting they are largely superseded by denoising approaches on both practicality and performance.
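The 1-D Gaussian formula is easy to verify numerically: differentiating log p(x) for p = N(μ, σ²) gives exactly −(x − μ)/σ². A quick finite-difference check (μ and σ are arbitrary illustrative values):

```python
# Check that d/dx log N(x; mu, sigma^2) equals -(x - mu)/sigma^2.
import numpy as np

mu, sigma = 1.0, 2.0

def log_pdf(x):
    # log of the Gaussian density, written out directly
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def analytic_score(x):
    return -(x - mu) / sigma**2

x = np.linspace(-3.0, 5.0, 9)
h = 1e-5
finite_diff = (log_pdf(x + h) - log_pdf(x - h)) / (2 * h)
print(np.max(np.abs(finite_diff - analytic_score(x))))  # ~0: the two agree
```

Note the sign: for x above the mean the score is negative and for x below it is positive, which is precisely why following the score moves samples toward high-density regions.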
For practitioners, mastering score matching unlocks more flexible generative pipelines, enabling high‑quality image synthesis without explicit likelihood computation. It also aligns with current research trends that blend diffusion models with score‑based sampling, shaping the next wave of AI‑driven content creation.
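To tie the pieces together, the denoising score matching recipe from the summary can be sketched end-to-end in 1-D. This is a toy illustration under stated assumptions: the data are standard normal, the "model" is a linear function s(x) = a·x + b fitted by least squares (standing in for a neural network), and the regression target is the analytic conditional score −(x̃ − x)/σ² for Gaussian perturbation x̃ = x + σε.

```python
# Toy denoising score matching: regress a linear score model onto the
# analytic conditional score of Gaussian-perturbed samples.
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 50_000, 0.5
x = rng.standard_normal(n)             # clean data ~ N(0, 1)
eps = rng.standard_normal(n)
x_tilde = x + sigma * eps              # Gaussian-perturbed samples
target = -(x_tilde - x) / sigma**2     # analytic conditional score

# Fit s(x~) = a*x~ + b by least squares against the DSM target.
A = np.stack([x_tilde, np.ones(n)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, target, rcond=None)

# The noised marginal is N(0, 1 + sigma^2), whose true score is
# -x / (1 + sigma^2), so we expect a ≈ -1/1.25 = -0.8 and b ≈ 0.
print(a, b)
```

Even though the regression target only uses the noise added to each individual sample, the fitted model recovers the score of the noised marginal, which is the key property that makes denoising score matching work without access to the true data distribution.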