
AI

Google’s New Gemini Pro Model Has Record Benchmark Scores — Again

TechCrunch AI • February 20, 2026

Why It Matters

Gemini 3.1 Pro reasserts Google’s competitive edge in enterprise AI, potentially driving higher adoption of its cloud AI services and shaping the standards for agentic workloads.

Key Takeaways

  • Gemini 3.1 Pro tops APEX-Agents benchmark
  • Record scores on Humanity’s Last Exam test
  • Preview available now; full release imminent
  • Improves multi-step reasoning for agentic tasks
  • Intensifies competition with OpenAI, Anthropic

Pulse Analysis

Google’s Gemini 3.1 Pro marks a notable leap in large language model performance, delivering record‑breaking scores on independent evaluations such as Humanity’s Last Exam. These benchmarks assess not only raw language understanding but also the model’s ability to synthesize complex information, a metric increasingly prized by enterprises seeking reliable AI assistants. By surpassing its November‑released predecessor, Gemini 3, the new version demonstrates accelerated progress in model scaling and fine‑tuning techniques, reinforcing Google’s reputation for cutting‑edge AI research.

The model’s prominence on Mercor’s APEX‑Agents leaderboard underscores its practical utility for agentic workflows. APEX measures how effectively AI agents complete multi‑step, knowledge‑intensive tasks—a core requirement for customer support bots, data analysis pipelines, and automated decision‑making systems. Gemini 3.1 Pro’s top placement signals that it can handle intricate reasoning chains with reduced hallucination, offering businesses a more trustworthy tool for mission‑critical operations. This capability aligns with the broader industry shift toward AI that can act autonomously rather than merely generate text.

In the broader AI model wars, Gemini 3.1 Pro raises the stakes for competitors such as OpenAI and Anthropic, which have recently rolled out their own next‑gen models. Google’s integration of the model into its Cloud AI suite could attract enterprises looking for seamless deployment, robust security, and deep integration with existing Google services. As firms evaluate AI platforms, the benchmark advantage of Gemini 3.1 Pro may become a decisive factor, shortening adoption cycles and pushing rivals to speed up their own performance improvements. The coming months will reveal whether Google can translate these technical gains into sustained market share growth.


Read Original Article