What Happens to Software When Proof Is Cheap?

UW CSE (Allen School)
Apr 17, 2026

Why It Matters

Cheaper proof generation could scale formal verification across the software stack, reducing costly bugs and improving the reliability of powerful AI applications.

Key Takeaways

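  • Three AI systems reached gold‑medal standard at the July 2025 International Math Olympiad; one, Harmonic's Aristotle, did so with formal Lean proofs
  • Six months later, several AIs working together used Lean to solve an open problem posed by Paul Erdős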
  • Cheap formal verification could eliminate entire classes of software bugs
  • Verifying AI code becomes critical as systems gain decision‑making power

Pulse Analysis

The lecture, part of the Allen School's 2025‑26 Distinguished Lecture Series, highlighted a watershed moment: AI systems not only reached gold‑medal standard in competitive mathematics, but one of them did so by constructing machine‑checkable proofs in Lean, a proof assistant that mathematicians use to formalize theorems and engineers use to verify code. This shift suggests that the barrier between abstract theorem proving and practical software verification is eroding as machine‑generated reasoning becomes faster and cheaper. By automating the labor‑intensive steps of formal methods, AI could broaden access to rigorous correctness guarantees that have so far been limited to a few notable successes in cryptography, operating systems, and parser security.

Formal verification has historically been hampered by the steep expertise required to encode system behavior in mathematical logic and to carry out the resulting proofs. If AI lowers the cost of that reasoning, developers could feed code into Lean‑based pipelines and receive machine‑checked proofs of safety properties far faster than manual proof efforts allow. The implications are significant: large codebases could be re‑verified continuously as they evolve, reducing regression bugs, security vulnerabilities, and costly post‑release patches. Industries ranging from autonomous vehicles to cloud infrastructure stand to benefit from a workflow in which correctness is built in during development rather than retrofitted.
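
To make "machine‑checked proof of a safety property" concrete, here is a minimal Lean 4 sketch. It is an illustration only and is not taken from the lecture: the function `clamp` and the theorem `clamp_le` are hypothetical names chosen for this example. The code defines a small function and then proves, for every possible input, that its result never exceeds a given limit.

```lean
-- Hypothetical example, not from the lecture:
-- cap a natural number at an upper limit.
def clamp (limit x : Nat) : Nat :=
  if x ≤ limit then x else limit

-- Safety property: the result never exceeds the limit.
-- Lean accepts this theorem only if the proof covers all inputs.
theorem clamp_le (limit x : Nat) : clamp limit x ≤ limit := by
  unfold clamp            -- expose the if-then-else in the goal
  split                   -- one case per branch of the conditional
  · assumption            -- branch x ≤ limit: the goal is that hypothesis
  · exact Nat.le_refl limit  -- other branch: limit ≤ limit
```

Unlike a unit test, which checks a handful of inputs, the theorem is a statement about all values of `limit` and `x`. Real verification targets are far larger, which is where the cost of the proof effort has traditionally been the limiting factor.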

Paradoxically, the advances that make verification affordable are also producing the systems that most urgently need it. As AI models grow more capable and more opaque, they are being granted increasing power over consequential decisions in areas such as finance, healthcare, and national security. Ensuring that these models behave as intended, and that their underlying software cannot be subverted, calls for the same formal guarantees that may now be coming within reach. Companies like Galois, which builds formal methods and security technologies for clients including DARPA and AWS, are positioned to translate academic breakthroughs into enterprise‑grade tooling for trustworthy AI deployment.

Original Description

2025-26 Allen School Distinguished Lecture Series
Title: What Happens to Software When Proof is Cheap?
Speaker: Mike Dodds (Galois, Inc.)
Date: Thursday, April 16, 2026
Abstract: In July 2025, three AI systems independently achieved gold-medal standard at the International Math Olympiad. One of them, Harmonic's Aristotle, did it by constructing formal proofs in the Lean proof assistant. Six months later, several AIs working together used Lean to solve an open problem posed by Paul Erdős. We may soon live in a strange world where AI is better at math than any human expert.
Lean and tools like it bridge two worlds: mathematicians use them to formalise theorems, but engineers use them to prove that code behaves correctly. This second use, formal verification, has a long history and a few notable successes in cryptography, operating systems, and parser security. But these successes have always been limited by the sheer difficulty of the mathematical reasoning they require.
Now, AI may be changing this picture. If mathematical reasoning is cheap, we could eliminate entire classes of bugs from systems at scale, guarantee that safety-critical code behaves as intended, or verify auto-generated code as fast as it is written. Our need for rigorous verification is growing just as the cost of doing so may be dropping.
The most important software in need of verification may be AI systems themselves. These are growing more capable and more opaque, and we are granting them increasing power over consequential decisions. The same advances making mathematical proof cheaper may also be creating the systems that most urgently need to be proved safe.
Bio: Mike Dodds is a Principal Scientist at Galois, Inc., an employee-owned research company in Portland, OR. Galois builds formal methods and security technologies for clients including DARPA and AWS. Mike's work focuses on making formal verification practical: he led the verification of core cryptographic code in the AWS-LibCrypto library, built reference PDF parsers for the PDF Association, and developed tools for translating legacy C code into Rust. Before Galois, he held academic positions at the University of Cambridge and the University of York, where he worked on separation logic, concurrency, and hardware memory models. He holds a PhD from York.
