Gemini’s New Token Limits Are Just as Bad as Claude’s, and Maybe Even a Little Dumber

MakeUseOf • May 21, 2026

Companies Mentioned

Google

GOOG

YouTube

OpenAI

Anthropic

Why It Matters

The tighter limits and perceived dip in model quality could drive enterprise and developer migration to rival platforms, reshaping the competitive landscape of generative AI services.

Key Takeaways

• AI Ultra price cut to $200/month; the plan adds YouTube Premium and Gemini Spark.
• Compute-based quotas refresh every five hours, limiting continuous sessions.
• Gemini 3.5 Flash reportedly shows a higher hallucination rate than 3.1 Pro.
• Power users hit limits after ~40 minutes, prompting workarounds.

Pulse Analysis

Google’s latest Gemini subscription overhaul reflects a broader industry shift toward monetizing compute resources rather than simple query counts. By bundling premium Google services—YouTube Premium, a new image‑creation tool, and the Gemini Spark agent—into higher‑tier plans, the company aims to increase stickiness among business users. The price reduction for the Ultra tier to $200 per month, alongside a $100‑per‑month option, signals an attempt to retain power users while extracting more value from heavy compute consumption. This tiered strategy mirrors moves by competitors who are packaging AI capabilities with existing ecosystems to boost revenue.

The switch to compute-based quotas, which refresh every five hours and sit under a weekly ceiling, fundamentally changes how developers and knowledge workers interact with Gemini. Users now hit abrupt pauses after roughly 40 minutes of intensive prompting, especially when using high-cost features like Omni video generation. The scheme resembles Anthropic’s usage limits for Claude, and many users respond by fragmenting sessions or seeking alternative platforms. The tighter controls aim to curb runaway usage costs for Google, but they also risk alienating the very segment that fuels enterprise adoption: teams that rely on continuous, context-rich interactions for research, coding, and content creation.
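To make the mechanics concrete, here is a minimal Python sketch of a quota with a five-hour refresh and a weekly ceiling. Only the five-hour window and the existence of a weekly cap come from the reporting above; the `ComputeQuota` class, the "compute unit" accounting, and every budget figure are hypothetical and are not Google's actual implementation or numbers.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 5 * 60 * 60      # five-hour refresh, as reported
WEEK_SECONDS = 7 * 24 * 60 * 60   # period of the weekly ceiling


@dataclass
class ComputeQuota:
    """Hypothetical model: budgets are in made-up 'compute units'."""
    window_budget: float
    weekly_budget: float
    window_start: float = 0.0
    week_start: float = 0.0
    window_used: float = 0.0
    week_used: float = 0.0

    def _refresh(self, now: float) -> None:
        # Reset whichever counter has outlived its period.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start, self.window_used = now, 0.0
        if now - self.week_start >= WEEK_SECONDS:
            self.week_start, self.week_used = now, 0.0

    def try_spend(self, cost: float, now: float) -> bool:
        """Charge `cost` units at time `now` (seconds); False means blocked."""
        self._refresh(now)
        if self.window_used + cost > self.window_budget:
            return False
        if self.week_used + cost > self.weekly_budget:
            return False
        self.window_used += cost
        self.week_used += cost
        return True


def minutes_until_blocked(quota: ComputeQuota, cost_per_minute: float):
    """Spend once per minute from t=0; return the minute of the first refusal."""
    for minute in range(WINDOW_SECONDS // 60):
        if not quota.try_spend(cost_per_minute, now=minute * 60.0):
            return minute
    return None  # never blocked within the first window


if __name__ == "__main__":
    quota = ComputeQuota(window_budget=100, weekly_budget=1000)
    print(minutes_until_blocked(quota, cost_per_minute=2.5))
```

The illustrative figures (100 units per window, 2.5 units per minute) were picked so that the block lands at the 40-minute mark described above. Under this model a heavier feature simply carries a larger per-minute cost and exhausts the window sooner, while the weekly ceiling caps how many refreshed windows can be used in full.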

Compounding the usage friction is a reported dip in model reliability. Gemini 3.5 Flash appears to hallucinate more frequently than the older 3.1 Pro and to struggle more with basic extraction tasks. For organizations where factual accuracy is non-negotiable, this regression nudges them toward OpenAI’s ChatGPT or Anthropic’s Claude, both described as performing more steadily. The migration trend underscores a growing market expectation: AI providers must balance cost controls with consistent model quality, or risk losing premium customers to rivals offering more predictable outputs and flexible pricing structures.
