DeepSeek Is Back: China's AI Claims to Surpass ChatGPT and Gemini in Key Benchmarks

Mint – Technology (India), Apr 24, 2026

Why It Matters

DeepSeek’s open‑source breakthrough narrows the performance gap with dominant U.S. models, offering enterprises a high‑capacity, cost‑efficient alternative for specialized coding and reasoning workloads. Its timing intensifies competitive pressure on incumbents and could reshape market dynamics in enterprise AI adoption.

Key Takeaways

  • DeepSeek‑V4‑Pro has 1.6 trillion parameters, Flash 284 billion
  • Both models support a one‑million‑token context window (~750k words)
  • V4‑Pro‑Max leads coding benchmarks, scoring 90.2% on Apex Shortlist
  • Gemini still leads in factual QA, scoring 75.6% vs 57.9% for DeepSeek
  • Launch follows OpenAI’s GPT‑5.5 release, intensifying AI competition
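
The ~750k-word figure above implies a ratio of about 0.75 English words per token. That ratio is a common rule of thumb inferred from the article's numbers, not a published DeepSeek tokenizer statistic, and real ratios vary by language and content. A minimal sketch of the arithmetic, using hypothetical helper names:

```python
# Assumption: ~0.75 words per token, the ratio implied by
# "one-million-token context window (~750k words)". Not a DeepSeek figure.
WORDS_PER_TOKEN = 0.75
CONTEXT_WINDOW_TOKENS = 1_000_000


def approx_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(tokens * words_per_token)


def fits_in_context(
    word_count: int,
    window_tokens: int = CONTEXT_WINDOW_TOKENS,
    words_per_token: float = WORDS_PER_TOKEN,
) -> bool:
    """Estimate whether a document of `word_count` words fits in the window."""
    estimated_tokens = word_count / words_per_token
    return estimated_tokens <= window_tokens


if __name__ == "__main__":
    print(approx_words(CONTEXT_WINDOW_TOKENS))
    print(fits_in_context(800_000))
```

Under this assumption, a document of 800,000 words would need more than one million tokens and so would not fit in a single context window.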

Pulse Analysis

DeepSeek’s V4 rollout marks a technical leap for Chinese AI firms, delivering models that combine massive scale with ultra‑long context handling. The Pro version’s 1.6 trillion parameters place it among the world’s largest language models, while the Flash variant offers a lighter footprint for cost‑sensitive deployments. By introducing three distinct reasoning modes—Non‑think, Think High, and Think Max—DeepSeek aims to let developers select the optimal balance of speed and depth for tasks ranging from routine email drafting to complex algorithmic problem solving.

Benchmark results underscore the model’s niche strengths. V4‑Pro‑Max topped the Apex Shortlist with a 90.2% score and earned a Codeforces rating of 3206, signaling elite competitive‑programming capability. It also tied for first on SWE Verified, a practical software‑engineering test. However, the model fell behind Gemini and GPT‑5.4 on knowledge‑heavy evaluations such as MMLU‑Pro and SimpleQA‑Verified, highlighting a trade‑off between specialized coding prowess and broad factual recall. The claimed efficiency gain, roughly one‑tenth the memory V3.2 uses on long inputs, could make DeepSeek attractive for enterprises that need high‑throughput, low‑latency inference on massive documents.

The timing of DeepSeek’s announcement is strategic. OpenAI’s GPT‑5.5 launch, positioned as a direct response to Anthropic’s Claude, has reignited the arms race for foundation‑model supremacy. DeepSeek’s open‑source positioning offers a cost‑effective alternative for firms wary of vendor lock‑in, especially in regulated sectors where data sovereignty is paramount. As the AI market matures, the ability to deploy a trillion‑parameter model without the premium pricing of closed‑source offerings could shift procurement decisions, prompting larger players to accelerate innovation or adjust pricing to retain enterprise customers. The competitive pressure may ultimately drive faster convergence of performance across the global AI ecosystem.
