Distinguished Seminar in Optimization & Data: Santosh Vempala (Georgia Tech)
Why It Matters
If pre-training inherently promotes confident falsehoods and post-training fails to discourage them, LLMs will continue to produce misleading outputs with real-world risks for consumers and enterprises. Fixing training and calibration is therefore essential for safer, more reliable AI deployment.
Summary
In a seminar on why language models hallucinate, Santosh Vempala (with collaborators from OpenAI) argued that the standard pre-training objective, maximizing likelihood over a training distribution, mathematically encourages models to produce plausible but false outputs even when the training data itself is valid. He illustrated the problem with the behavior of contemporary LLMs, such as confident but incorrect formal proofs, and framed hallucination as a labeling issue over the space of possible documents. Vempala also criticized current post-training practices, saying that they score “I don’t know” the same as a wrong answer and therefore do not penalize confident errors enough. He recommended revising post-training penalties and calibration to reduce unjustified confidence and hallucinations.
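The point about post-training penalties can be made concrete with a small expected-score calculation. The sketch below is illustrative only and is not taken from the talk: the scoring scheme (+1 for a correct answer, 0 for abstaining, a configurable penalty for a wrong answer) and the names expected_score and should_answer are assumptions chosen for the example.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering, under an assumed scoring scheme:
    +1 if correct, -wrong_penalty if wrong. Abstaining always scores 0."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only when guessing beats the abstention score of 0."""
    return expected_score(p_correct, wrong_penalty) > 0.0


if __name__ == "__main__":
    # Hypothetical confidence levels, for illustration only.
    for p in (0.1, 0.4, 0.6, 0.9):
        binary = should_answer(p, wrong_penalty=0.0)     # wrong scores the same as abstaining
        penalized = should_answer(p, wrong_penalty=1.0)  # wrong costs as much as correct earns
        print(f"p={p:.1f}  answer under binary grading: {binary}  under penalty 1: {penalized}")
```

With no penalty for wrong answers, guessing is never worse than abstaining, so a model optimized against that score has no reason to say “I don’t know”. With a penalty of 1, answering pays off only when the model is right more than half the time.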