The debate matters because misreading such research can distort public and policy views of AI risk and capability. In practice, LLMs remain effective when paired with tools and careful prompting, so businesses should focus on integration and guardrails rather than assuming outright incapacity.
A widely shared Apple paper arguing that large language models (LLMs) “don’t reason” sparked sensational headlines, but a close read shows its findings largely restate known limits: LLMs are probabilistic generators that struggle with exact, high-complexity computation and long multi-step tasks. The paper’s experiments, on puzzles such as Tower of Hanoi, checker jumping, and extended arithmetic, show performance dropping as task complexity and token-length demands rise, issues exacerbated by evaluation choices and token limits. Crucially, the models perform far better when allowed to use tools or code and when chain-of-thought prompting is used, suggesting the failures reflect design and testing limitations rather than a fundamental inability to “reason.” The critique also notes the authors shifted tests midstream after initial comparisons didn’t support their narrative, weakening the paper’s thought-provoking but overstated claims.
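To make the token-length point concrete, here is a minimal Python sketch (not from the paper): the optimal Tower of Hanoi solution for n disks takes 2^n − 1 moves, so asking a model to enumerate every move quickly exceeds any output budget, while a few lines of code compute the same solution exactly. That is the kind of tool use under which the models in question performed well.

```python
# Minimal sketch: Tower of Hanoi solved exactly by a short program.
# A model asked to list every move for n disks must emit 2**n - 1 steps,
# which rapidly exceeds any token budget; delegating to code sidesteps that.

def hanoi(n, source="A", target="C", spare="B", moves=None):
    """Return the optimal move list for n disks (source -> target)."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)   # move n-1 disks out of the way
    moves.append((source, target))               # move the largest disk
    hanoi(n - 1, spare, target, source, moves)   # stack the n-1 disks back on top
    return moves

if __name__ == "__main__":
    for n in (3, 10, 20):
        # 7, 1023, and 1_048_575 moves respectively: exponential output length
        print(f"{n} disks -> {len(hanoi(n))} moves")
```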