Stanford CS221 | Autumn 2025 | Lecture 14: Bayesian Networks and Learning
Why It Matters
Learning Bayesian network parameters from data transforms abstract probabilistic models into actionable tools for AI, enabling accurate inference and scalable decision‑making across diverse industries.
Key Takeaways
- Bayesian networks define joint distributions via directed acyclic graphs.
- Parameter learning reduces to counting occurrences and normalizing frequencies.
- Conditional independence enables parallel inference and simplifies computations.
- Fully observed data allows straightforward maximum likelihood estimation for CPTs.
- V-structures require careful handling but follow the same count-and-normalize principle.
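The count-and-normalize idea behind these takeaways can be sketched as a maximum-likelihood estimate for a single discrete variable (the dataset and variable name below are illustrative, not from the lecture):

```python
from collections import Counter

def mle_single(values):
    """MLE for a single discrete variable: count occurrences, then
    normalize the counts so the probabilities sum to one."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

# Hypothetical movie-rating observations
ratings = [5, 4, 5, 3, 5, 4]
print(mle_single(ratings))  # {5: 0.5, 4: 0.333..., 3: 0.166...}
```

The same two steps, counting and normalizing, reappear unchanged as networks grow more complex; only the configurations being counted change.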
Summary
The lecture revisits Bayesian networks as a compact representation of joint probability distributions, built from a directed acyclic graph and local conditional probability tables. After a quick refresher using the classic burglary-earthquake-alarm example, the professor reviews exact and approximate inference methods, marginalization, rejection sampling, and Gibbs sampling, and introduces d-separation rules that determine conditional independence. Key insights include how independence is read off the graph: a path is blocked when a node is conditioned on (or when neither a V-structure's node nor its descendants are conditioned on), enabling parallel computation during inference.

The instructor then shifts to learning: with fully observed data, maximum-likelihood estimates of each conditional table are obtained by simply counting occurrences of each variable configuration and normalizing to sum to one. Illustrative examples progress from a single-node network modeling movie ratings, to a two-node network adding genre, and finally a three-node network that incorporates awards. In each case, the learning algorithm iterates over the dataset, updates counts for the relevant parent-child configurations, and normalizes to produce the conditional probability tables, demonstrating that even complex structures follow the same count-and-normalize pattern.

The practical implication is that Bayesian networks become data-driven models once their parameters are learned, allowing scalable probabilistic reasoning in real-world domains such as recommendation systems, fault diagnosis, and causal analysis. Mastery of conditional independence and efficient parameter estimation is essential for deploying reliable AI systems that can reason under uncertainty.
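The parent-child counting loop described in the summary can be sketched as follows. The dataset, variable names, and the genre-to-rating structure are illustrative stand-ins for the lecture's two-node example:

```python
from collections import defaultdict

def learn_cpt(data, child, parents):
    """Estimate P(child | parents) from fully observed data:
    count each (parent configuration, child value) pair, then
    normalize each row of the conditional probability table."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in data:
        parent_config = tuple(row[p] for p in parents)
        counts[parent_config][row[child]] += 1
    cpt = {}
    for parent_config, child_counts in counts.items():
        total = sum(child_counts.values())
        cpt[parent_config] = {v: c / total for v, c in child_counts.items()}
    return cpt

# Hypothetical observations for a two-node network: genre -> rating
data = [
    {"genre": "drama",  "rating": 4},
    {"genre": "drama",  "rating": 5},
    {"genre": "comedy", "rating": 3},
    {"genre": "comedy", "rating": 3},
]
print(learn_cpt(data, "rating", ["genre"]))
# {('drama',): {4: 0.5, 5: 0.5}, ('comedy',): {3: 1.0}}
```

Extending to the three-node network (e.g. adding awards as a second parent) changes nothing in the algorithm: the parent configuration tuple simply grows by one element, which is why the lecture stresses that all structures reduce to the same count-and-normalize pattern.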