Apple’s ‘AI Can’t Reason’ Claim Seen By 13M+, What You Need to Know

AI Explained
Jun 12, 2025

Why It Matters

The debate matters because misreading such research can distort public and policy views on AI risk and capability. In practice, LLMs remain powerful when paired with tools and well-designed prompts, so businesses should focus on integration and guardrails rather than assuming outright incapacity.

Summary

A widely shared Apple paper arguing that large language models (LLMs) “don’t reason” sparked sensational headlines, but a close read shows its findings largely restate known limits: LLMs are probabilistic generators that struggle with exact, high-complexity computation and long multi-step tasks. The paper’s experiments, on puzzles like Tower of Hanoi, checkers, and extended arithmetic, show performance dropping as task complexity and token-length demands rise, issues exacerbated by evaluation choices and token limits. Crucially, the models perform far better when allowed to use tools or write code, and when chain-of-thought prompting is used, suggesting the failures reflect design and testing limitations rather than a fundamental inability to “reason.” The critique also notes that the authors shifted tests midstream after initial comparisons didn’t support their narrative, weakening the paper’s thought-provoking but overstated claims.
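
To make the token-length point concrete, here is a minimal sketch (my illustration, not the paper's actual evaluation harness; the hanoi function and peg names are hypothetical). Solving Tower of Hanoi with n disks takes exactly 2**n − 1 moves, so a model asked to transcribe every move faces exponentially growing output, while a few lines of code produce the exact solution.

```python
# Illustrative sketch only (not from the Apple paper): Tower of Hanoi
# needs 2**n - 1 moves, so writing out every move token-by-token grows
# exponentially, while a short program solves any instance exactly.

def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Return the full move list for n disks; its length is 2**n - 1."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, src, dst, aux, moves)  # clear n-1 disks onto the spare peg
    moves.append((src, dst))            # move the largest disk
    hanoi(n - 1, aux, src, dst, moves)  # restack the n-1 disks on top
    return moves

for n in (3, 10, 20):
    print(n, "disks:", 2**n - 1, "moves")  # 7, 1023, 1048575

assert len(hanoi(10)) == 2**10 - 1
```

A model forced to enumerate all 1,048,575 moves for 20 disks exhausts its output-token budget long before finishing, while one allowed to emit a short program like this solves the same instance exactly, which is the tool-use point above.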

Original Description

What to make of those headlines that AI can’t reason, seen by tens of millions? I cover the Apple paper in layman’s terms, what it means and doesn’t mean, and what’s next.
Thanks to Storyblocks for sponsoring this video! Download unlimited stock media at one set price with Storyblocks: https://storyblocks.com/AIExplained
Plus o3-pro and whether it is my current most-recommended model.
Chapters:
00:00 - Introduction
00:57 - Viral Post + Headlines
01:42 - Apple Paper Analysis
08:34 - But they do Hallucinate
10:43 - Not Supercomputers
11:18 - o3 Pro and Recommendations
