Normus‑1 demonstrates that open‑source AI can reach elite human levels of mathematical reasoning, democratizing access to powerful problem‑solving tools and reshaping competitive dynamics in research and education.
Open‑source researchers at Noise announced that their new 30‑billion‑parameter model, Normus‑1, scored 87 out of 120 on the 2025 Putnam Mathematical Competition, a result within the range of elite human performance on one of the world's toughest undergraduate math exams. The model was built specifically for deep mathematical reasoning, and its code, including the orchestration framework, has been released publicly.
Normus‑1’s advantage stems from a two‑phase reasoning architecture. In the first phase, multiple AI “workers” independently generate solution attempts; in the second, each attempt undergoes self‑critique and a tournament‑style bracket selects the most robust answer. When the same benchmark was run on a larger 330‑billion‑parameter model lacking this workflow, it managed only 24 points, underscoring that the gains arise from the specialized training and reasoning pipeline rather than sheer model size.
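To make the two‑phase workflow concrete, here is a minimal Python sketch. It is an illustration, not the released orchestration code: the function names (generate_attempt, self_critique, judge) are hypothetical stand‑ins for model calls, and the judge is stubbed with a coin flip so the sketch runs end to end.

```python
# Minimal sketch of a generate / self-critique / tournament pipeline.
# All function names below are hypothetical stand-ins for model calls.
import random
from dataclasses import dataclass

@dataclass
class Attempt:
    solution: str
    critique: str = ""

def generate_attempt(problem: str, worker_id: int) -> Attempt:
    """Phase 1: each worker independently drafts a solution (stubbed)."""
    return Attempt(solution=f"worker-{worker_id} draft for: {problem}")

def self_critique(attempt: Attempt) -> Attempt:
    """Phase 2a: the model reviews its own draft and annotates it (stubbed)."""
    attempt.critique = "checked edge cases; no gaps found"
    return attempt

def judge(a: Attempt, b: Attempt) -> Attempt:
    """Phase 2b: pairwise comparison. A real judge would weigh the
    critiques; here a coin flip keeps the sketch self-contained."""
    return random.choice([a, b])

def tournament(attempts: list[Attempt]) -> Attempt:
    """Single-elimination bracket: halve the field until one survives."""
    while len(attempts) > 1:
        next_round = []
        for i in range(0, len(attempts) - 1, 2):
            next_round.append(judge(attempts[i], attempts[i + 1]))
        if len(attempts) % 2 == 1:  # an odd attempt gets a bye
            next_round.append(attempts[-1])
        attempts = next_round
    return attempts[0]

def solve(problem: str, n_workers: int = 8) -> Attempt:
    drafts = [generate_attempt(problem, i) for i in range(n_workers)]  # phase 1
    critiqued = [self_critique(d) for d in drafts]                     # phase 2
    return tournament(critiqued)

if __name__ == "__main__":
    print(solve("Putnam 2025, Problem A1").solution)
```

One appeal of the bracket design: selecting a winner from n attempts costs only n − 1 pairwise comparisons, versus the n(n − 1)/2 a round‑robin would require.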
The researchers highlighted that the entire stack, from the 30‑billion‑parameter model to the multi‑worker self‑critique loop and the tournament selector, is open source, allowing anyone to reproduce or extend the system. They pointed to specific problem types, such as combinatorial proofs and functional equations, where Normus‑1 consistently produced fine‑grained, step‑by‑step derivations that matched or exceeded human solutions.
The breakthrough signals a shift in the AI landscape: open‑source initiatives can now compete with proprietary, closed‑door systems on high‑level abstract reasoning tasks. This could accelerate the adoption of AI‑assisted research, tutoring, and complex engineering design, while also prompting a reevaluation of how academic competitions gauge human versus machine performance.