📈 Data to Start Your Week: The AI Capacity Trap

Exponential View · Apr 6, 2026

Key Takeaways

  • Token price drops fuel higher demand
  • OpenAI token processing grew 2.5x in five months
  • Google TPUs operating at full capacity despite age
  • Anthropic revenue up, but token price falling faster
  • Platforms tightening usage limits, squeezing users

Pulse Analysis

The AI compute surge mirrors the Jevons paradox: as processing becomes cheaper, consumption accelerates fast enough to offset the savings. OpenAI’s token throughput leapt from six to fifteen billion per minute in five months, illustrating how price reductions can trigger outsized demand spikes. This dynamic forces providers to balance pricing strategies against capacity constraints, as marginal revenue per token shrinks while total volume explodes.
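The tension between falling prices and rising volume can be made concrete with a little arithmetic. The sketch below uses the throughput figures from the article (6 to 15 billion tokens per minute, i.e. 2.5x growth); the 65% price decline is a hypothetical number chosen purely for illustration, not a figure from the piece.

```python
# Illustrative Jevons-style arithmetic. Throughput figures are from the
# article; the price decline is a HYPOTHETICAL assumption for illustration.

old_tokens_per_min = 6e9    # tokens/minute (article)
new_tokens_per_min = 15e9   # tokens/minute, five months later (article)
volume_growth = new_tokens_per_min / old_tokens_per_min  # 2.5x

old_price = 1.0             # arbitrary unit price per token
price_drop = 0.65           # hypothetical 65% decline over the same period
new_price = old_price * (1 - price_drop)

old_revenue = old_tokens_per_min * old_price
new_revenue = new_tokens_per_min * new_price

print(f"Volume growth:  {volume_growth:.1f}x")
print(f"Revenue ratio:  {new_revenue / old_revenue:.3f}")
# A ratio below 1 means revenue fell despite 2.5x more tokens served:
# whenever price falls faster than 1 - 1/volume_growth (here 60%),
# volume growth no longer covers the discount.
```

The break-even point is the useful takeaway: with 2.5x volume growth, any per-token price decline beyond 60% leaves total revenue lower than before.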

Hardware providers feel the pressure too. Google’s Tensor Processing Units, some of them seven-year-old models, are now running at near-full utilization, a testament to the relentless demand for compute. The extended service life of older hardware underscores a supply bottleneck, prompting manufacturers to accelerate next-gen chip rollouts while grappling with inventory and depreciation challenges.

For AI firms, the revenue model is shifting from premium pricing to volume‑driven growth. Anthropic’s rising top‑line masks a faster decline in token price, making profitability increasingly dependent on scale. Concurrently, platforms are imposing stricter usage caps, squeezing developers and enterprises that rely on generous allowances. This tightening could slow innovation unless alternative pricing or efficiency breakthroughs emerge, signaling a pivotal inflection point for the AI services market.

