DeepMind’s New AI Found A Strange New Way To Think

Two Minute Papers • June 5, 2026

Why It Matters

The result demonstrates that system design, meaning the verification loop and judging infrastructure, can turn fallible large models into reliable tools for hard, decades-old mathematical problems, potentially accelerating automated theorem proving and lowering per-solution costs. It also shifts the focus of AI progress from model improvements alone to engineering better harnesses around models.

Summary

DeepMind’s new system, AlphaProof Nexus, attempted about 350 formalized Erdős problems and produced nine verified proofs, a failure rate of more than 95%, at a cost of a few hundred dollars per solved problem. The team used Lean for formal verification together with a novel tournament-style loop: multiple AI-generated candidate proofs are iteratively judged and refined, with a cheaper verifier ranking the partial solutions, until one passes the validator. The approach accepts that the base models are unreliable and will lie or fail repeatedly, and it extracts reliable proofs by tightening the surrounding orchestration and validation. The experiment focused on a subset of easier-to-formalize problems and still required large models, underscoring both the promise and the current limits.
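
To make the shape of that loop concrete, here is a minimal Python sketch of a tournament-style search. It is not DeepMind's implementation: every name in it (`tournament_search`, `propose`, `refine`, `judge`, `validate`, `TARGET`) is hypothetical, and the "proof" is a toy list of digits. The strict `validate` check stands in for Lean, the cheap `judge` stands in for the cheaper ranking verifier, and `propose`/`refine` stand in for an unreliable model that mostly produces wrong candidates.

```python
import random
from typing import Callable, List, Optional

Candidate = List[int]

def tournament_search(
    propose: Callable[[random.Random], Candidate],
    refine: Callable[[Candidate, random.Random], Candidate],
    judge: Callable[[Candidate], float],
    validate: Callable[[Candidate], bool],
    population: int = 16,
    survivors: int = 4,
    max_rounds: int = 1000,
    seed: int = 0,
) -> Optional[Candidate]:
    """Rank candidates with a cheap judge, refine the best, stop on validation."""
    rng = random.Random(seed)
    pool = [propose(rng) for _ in range(population)]
    for _ in range(max_rounds):
        # Only the strict validator can accept a candidate (Lean's role).
        for cand in pool:
            if validate(cand):
                return cand
        # The cheap judge only ranks partial solutions; it never accepts one.
        pool.sort(key=judge, reverse=True)
        best = pool[:survivors]
        # Keep the top candidates and refill the pool with refinements of them.
        pool = best + [
            refine(rng.choice(best), rng) for _ in range(population - survivors)
        ]
    return None  # budget exhausted: a failed attempt, which is the common case

# Toy stand-ins (assumptions for illustration only).
TARGET = [3, 1, 4, 1, 5]

def propose(rng: random.Random) -> Candidate:
    # An "unreliable model": a random guess that is almost always wrong.
    return [rng.randrange(10) for _ in TARGET]

def refine(cand: Candidate, rng: random.Random) -> Candidate:
    # Change one position at random; most refinements do not help.
    out = list(cand)
    out[rng.randrange(len(out))] = rng.randrange(10)
    return out

def judge(cand: Candidate) -> float:
    # Cheap partial-credit score: how many positions already match.
    return sum(a == b for a, b in zip(cand, TARGET))

def validate(cand: Candidate) -> bool:
    # Strict all-or-nothing check, standing in for formal verification.
    return cand == TARGET

if __name__ == "__main__":
    print(tournament_search(propose, refine, judge, validate))
```

The design point the sketch tries to capture is the separation of roles: the generator may be wrong almost every time and the judge may be imprecise, but a candidate is returned only if the strict validator accepts it, so errors upstream cost compute rather than correctness.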

Original Description

❤️ Check out Weights & Biases and sign up for a free demo here: https://wandb.me/papers

📝 The paper is available here:

https://github.com/google-deepmind/alphaproof-nexus-results

https://arxiv.org/html/2605.22763v1

🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:

Adam Bridges, Benji Rabhan, B Shang, Cameron Navor, Charles Ian Norman Venn, Christian Ahlin, Eric T, Fred R, Gordon Child, Juan Benet, Michael Tedder, Owen Skarpness, Richard Sundvall, Ryan Stankye, Shawn Becker, Steef, Taras Bobrovytsky, Tazaur Sagenclaw, Tybie Fitzhugh, Ueli Gallizzi

My research: https://cg.tuwien.ac.at/~zsolnai/

Thumbnail design: https://felicia.hu
