Distinguished Seminar in Optimization & Data: Santosh Vempala (Georgia Tech)

UW CSE (Allen School)
May 13, 2026

Why It Matters

If pre-training inherently promotes confident falsehoods and post-training fails to discourage them, LLMs will keep producing misleading outputs that carry real-world risks for consumers and enterprises. Fixing training and calibration is therefore essential for safer, more reliable AI deployment.

Summary

In a seminar on why language models hallucinate, Santosh Vempala (with collaborators from OpenAI) argued that the standard pre-training objective, maximizing likelihood over a training distribution, mathematically encourages models to produce plausible but false outputs even when the training data itself is valid. He illustrated the problem with contemporary LLM behavior (confident but incorrect formal proofs) and framed hallucination as a labeling problem over the space of possible documents. Vempala also criticized current post-training practices: because they score “I don’t know” the same as a wrong answer, guessing never scores worse than abstaining, so confident errors are not sufficiently penalized. He recommended revising post-training penalties and calibration to reduce unjustified confidence and hallucinations.
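The incentive argument in the summary can be made concrete with a little arithmetic. The sketch below is illustrative only: the scoring values (+1 for a correct answer, 0 for abstaining, and a configurable `wrong_penalty` for an incorrect answer) and the function names are assumptions chosen for this example, not the scheme presented in the talk. It shows that with no penalty for wrong answers, answering beats abstaining at any nonzero confidence, while an explicit penalty creates a confidence threshold below which abstaining is the better choice.

```python
# Illustrative sketch; scoring values and names are assumptions, not from the talk.

ABSTAIN_SCORE = 0.0  # assumed score for answering "I don't know"


def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if correct, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only when the expected score beats the abstention score."""
    return expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE


def break_even_confidence(wrong_penalty: float) -> float:
    """Confidence at which answering and abstaining have equal expected score.

    Solving p - (1 - p) * wrong_penalty = 0 gives
    p = wrong_penalty / (1 + wrong_penalty).
    """
    return wrong_penalty / (1.0 + wrong_penalty)


if __name__ == "__main__":
    for penalty in (0.0, 1.0, 3.0):
        threshold = break_even_confidence(penalty)
        print(f"wrong_penalty={penalty}: answer only above confidence {threshold:.2f}")
```

Under these assumed scores, a penalty of 0 gives a break-even confidence of 0, so a model is never worse off guessing. A penalty of 3 moves the break-even point to 0.75, so a model that is only 70% sure scores better by abstaining.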

Original Description

Title: Why Language Models Hallucinate
Speaker: Santosh Vempala (Georgia Tech)
Date: May 11, 2026
Abstract: Large language models often guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such "hallucinations" persist even in state-of-the-art systems. We analyze this phenomenon from a mathematical perspective and find that the statistical pressures of the next-word prediction training pipeline induce hallucinations, and that current evaluation procedures reward guessing over acknowledging uncertainty. We also propose "open rubric" evaluations with explicit error penalties, providing a practical path to reliable LLMs by aligning their incentives. The talk will be fact-based, and the speaker will readily admit ignorance. Joint work with Adam T. Kalai, Ofer Nachum and Eddie Zhang.
Bio: Santosh Vempala is Distinguished Professor of Computer Science and Frederick P. Storey II chair at Georgia Tech. He received his Ph.D. in Algorithms, Combinatorics, and Optimization from Carnegie Mellon University in 1997, advised by Avrim Blum. He is a Fellow of the AMS, Fellow of the ACM, Guggenheim Fellow, Sloan Fellow and Simons Investigator. He received the IEEE Machtey Award at FOCS 97, the SODA 2021 Best Paper Award and the 2024 Fulkerson Prize. Santosh Vempala is well known for his foundational contributions to algorithms for convex optimization and sampling.