
This Day in AI Podcast
OpenAI’s GPT‑5.2 hit the market with a 400k‑token context window and a modest price bump to $1.75 per million input tokens. While the larger window and vision tool upgrades sound impressive, most users report that the core model feels unchanged, delivering overly verbose answers that add little value. The release feels like a tuning exercise rather than a breakthrough, and the higher cost raises questions about ROI for businesses seeking efficient, concise outputs.
The biggest pain point emerges in tool‑calling and agentic workflows. GPT‑5.2 often fails to chain multiple tool calls, stumbling on parallel requests and refusing to self‑correct when a call goes awry. In contrast, Anthropic’s Claude models operate with an internal “clock,” iterating over tasks and allowing users to intervene between rounds. This iterative design yields more reliable agentic behavior, especially for complex, multi‑step processes. OpenAI’s newer ethical guardrails also inject unnecessary disclaimer language, further slowing down interactions and limiting practical utility.
Competitive pressure is reshaping the landscape. Grok 4.1 costs a fraction as much, around $0.20 per million tokens, and delivers faster responses, making it attractive for time‑critical tool‑calling scenarios. Gemini’s integration into Google services is gaining user trust, while DeepSeek’s emergence broke the perception that ChatGPT is the sole AI option. These dynamics have inflicted brand damage on OpenAI and delayed the anticipated “year of agents.” As enterprises experiment with delegation versus iterative collaboration, the market will likely reward models that balance speed, cost, and reliable agentic capabilities.
Join Simtheory: https://simtheory.ai
GPT-5.2 is here and... it's not great. In this episode, we put OpenAI's latest model through its paces and discover it can't even identify a convicted serial killer when the text literally says "serial killer." We compare it head-to-head with Claude Opus and Gemini 3 Pro (spoiler: they win). Plus, we reflect on the "Year of Agents" that wasn't, why your barber switched to Grok, Disney's billion-dollar investment to use Mickey Mouse in Sora, and why Mustafa Suleyman should probably be fired. Also featuring: the GPT-5.2 diss track where the model brags about capabilities it doesn't have.
CHAPTERS:
00:00 Intro - GPT-5.2 Drops + Details
01:25 First Impressions: Verbose, Overhyped, Vibe-Tuned
02:52 OpenAI's Rushed Response to Gemini 3
03:24 Tool Calling Problems & Agentic Failures
04:14 Why Anthropic's Models Just Work Better
06:31 The Barber Test: Real Users Are Switching to Grok
10:00 The Ivan Milat Vision Test (Serial Killer Edition)
17:04 Year of Agents Retrospective: What Went Wrong
25:28 The Path to True Agentic Workflows
31:22 GPT-5.2 Diss Track (Yes, Really)
43:43 Why We're Still Optimistic About AI
50:29 Google Bringing Ads to Gemini in 2026
54:46 Disney Pays $1B to Use Mickey Mouse in Sora
56:57 LOL of the Week: Mustafa Suleyman's Sad Tweets
1:00:35 Outro & Full GPT-5.2 Diss Track
Thanks for listening. Like & Sub. xoxox