Stanford CS221 | Autumn 2025 | Lecture 13: Bayesian Networks and Gibbs Sampling

Stanford Online
Mar 9, 2026

Why It Matters

Gibbs sampling gives data scientists and AI engineers a scalable inference tool for building reliable probabilistic models that support decision-making under uncertainty.

Key Takeaways

  • Build Bayesian networks by defining variables, graph structure, and CPTs.
  • Exact inference via joint distribution quickly becomes intractable for many variables.
  • Rejection sampling approximates queries but suffers from low acceptance with rare evidence.
  • Gibbs sampling generates dependent samples that always satisfy evidence, improving efficiency.
  • Conditional independence underpins network factorization and guides efficient inference algorithms.

Summary

The lecture revisits Bayesian networks, emphasizing their construction—identifying variables, drawing directed graphs, and populating conditional probability tables (CPTs). It then shifts focus to probabilistic inference, contrasting exact tensor‑based computation with approximate sampling methods, and introduces Gibbs sampling as a faster alternative to rejection sampling.
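The construction recipe (variables, directed graph, CPTs) can be sketched as a plain data structure. The sketch below uses the burglary-earthquake-alarm network from the lecture, but the specific probability values are illustrative assumptions, not necessarily the lecture's numbers.

```python
# A Bayesian network as plain data: variables, parent structure, and CPTs.
# Network: B -> A <- E (burglary, earthquake, alarm).
# Assumption: P(B=1) = P(E=1) = 0.05 and the alarm fires iff B or E.
parents = {"B": [], "E": [], "A": ["B", "E"]}

# cpt[var] maps a tuple of parent values to P(var = 1 | parents).
cpt = {
    "B": {(): 0.05},
    "E": {(): 0.05},
    "A": {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0},  # alarm = B OR E
}

def local_prob(var, value, assignment):
    """P(var = value | parents(var)), read off the CPT."""
    key = tuple(assignment[p] for p in parents[var])
    p1 = cpt[var][key]
    return p1 if value == 1 else 1 - p1

def joint(assignment):
    """Joint probability = product of local CPT entries (the factorization)."""
    prob = 1.0
    for var in parents:
        prob *= local_prob(var, assignment[var], assignment)
    return prob

print(joint({"B": 1, "E": 0, "A": 1}))  # 0.05 * 0.95 * 1.0
```

The point of the representation is that the joint distribution is never stored explicitly; it is always recovered as the product of the local CPT entries.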

Key insights include the factorization of the joint distribution as the product of local CPTs, illustrated with the classic burglary‑earthquake‑alarm example, where P(B=1 | A=1) ≈ 0.51. The instructor shows that exact inference requires enumerating all assignments, which grows exponentially in the number of variables, motivating approximate techniques. Rejection sampling is explained step by step, highlighting its simplicity but also its inefficiency when the evidence is rare, as demonstrated by a 300‑sample run yielding a rough estimate of 0.44.
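Both methods can be compared directly on the burglary example. The sketch below assumes, for illustration, P(B=1) = P(E=1) = 0.05 with a deterministic OR alarm; under those assumptions the exact posterior works out to 0.05 / 0.0975 ≈ 0.51, matching the quoted figure, while rejection sampling keeps only the forward samples consistent with the evidence A=1.

```python
import itertools
import random

# Illustrative assumed CPTs for B -> A <- E (not necessarily the lecture's numbers).
EPS = 0.05
p_b = lambda b: EPS if b == 1 else 1 - EPS
p_e = lambda e: EPS if e == 1 else 1 - EPS
p_a = lambda a, b, e: 1.0 if a == (b | e) else 0.0  # deterministic OR alarm

def exact_posterior_b_given_a1():
    """Exact inference: enumerate every assignment (exponential in general)."""
    num = den = 0.0
    for b, e, a in itertools.product([0, 1], repeat=3):
        w = p_b(b) * p_e(e) * p_a(a, b, e)  # joint = product of local CPTs
        if a == 1:
            den += w
            if b == 1:
                num += w
    return num / den

def rejection_sample(n=300, seed=0):
    """Forward-sample from the prior; reject samples with A != 1."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n):
        b = rng.random() < EPS
        e = rng.random() < EPS
        a = b or e
        if not a:
            continue  # inconsistent with evidence A=1: thrown away
        kept += 1
        hits += b
    return (hits / kept if kept else float("nan")), kept

print(round(exact_posterior_b_given_a1(), 2))  # 0.51
```

Since P(A=1) ≈ 0.0975 here, a 300‑sample run keeps only about 30 samples, so the rejection estimate is noisy — the source of the inefficiency the lecture highlights.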

The lecture then presents Gibbs sampling, a Markov chain Monte Carlo method that starts from a valid evidence‑consistent state and iteratively resamples each variable conditioned on the others. A telephone‑game network (A→B→C) illustrates how Gibbs sampling maintains evidence (C=1) while exploring the posterior over A, avoiding the costly rejections of the previous method. The discussion underscores the trade‑off: samples are correlated, yet the algorithm scales to high‑dimensional models.
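The Gibbs update on the telephone‑game chain can be sketched as follows. The CPT numbers are assumptions for illustration (P(A=1) = 0.5, each link copies its parent with probability 0.8); the structure of the update — resampling each unobserved variable from its conditional given the rest, while the evidence variable stays clamped — is the part the lecture describes.

```python
import random

# Chain A -> B -> C with evidence C = 1. Assumed CPTs: P(A=1) = 0.5 and
# each link copies its parent with probability 0.8 (illustrative numbers).
P_COPY = 0.8

def p_link(child, parent):
    return P_COPY if child == parent else 1 - P_COPY

def gibbs_posterior_a(iters=10_000, burn_in=1_000, seed=0):
    rng = random.Random(seed)
    a, b, c = 1, 1, 1  # start in a state consistent with the evidence C = 1
    count_a1 = kept = 0
    for t in range(iters):
        # Resample A given its Markov blanket {B}: P(A | b) ∝ P(A) * P(b | A)
        w1 = 0.5 * p_link(b, 1)
        w0 = 0.5 * p_link(b, 0)
        a = 1 if rng.random() < w1 / (w0 + w1) else 0
        # Resample B given {A, C}: P(B | a, c) ∝ P(B | a) * P(c | B)
        w1 = p_link(1, a) * p_link(c, 1)
        w0 = p_link(0, a) * p_link(c, 0)
        b = 1 if rng.random() < w1 / (w0 + w1) else 0
        # C is evidence: never resampled, so every sample satisfies C = 1
        if t >= burn_in:
            kept += 1
            count_a1 += a
    return count_a1 / kept
```

Under these assumed CPTs the exact posterior is P(A=1 | C=1) = 0.68, so the estimate should settle near that value — no sample is ever rejected, at the cost of successive samples being correlated.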

Implications are clear: Gibbs sampling enables scalable inference for large Bayesian networks common in AI, probabilistic programming, and decision‑support systems. Understanding conditional independence and appropriate sampling strategies equips practitioners to balance accuracy, computational cost, and convergence guarantees in real‑world applications.

Original Description

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai
Please follow along with the course schedule: https://stanford-cs221.github.io/autumn2025/
Teaching Team
Percy Liang, Associate Professor of Computer Science (and courtesy in Statistics)
