OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings

September 26, 2025

AI Explained

Why It Matters

The results show meaningful progress toward task-level parity, signaling potential productivity gains and disruption in specific white-collar workflows. They also underscore that current models are not yet poised to automate whole occupations, which affects how firms, regulators and investors should plan for AI adoption.

Summary

OpenAI published a study comparing frontier language models to industry experts on realistic, digitally oriented tasks and found some models are approaching expert deliverable quality. Anthropic’s Claude Opus 4.1 outperformed OpenAI’s models and in many cases came close to human experts, while performance varied significantly by file type and sector (PDFs, PowerPoints and Excel tasks fared best). The study also found that sufficiently capable models, exemplified by GPT-5, can speed up expert workflows, but that weaker models do not provide review-time savings. Crucially, the paper focused only on predominantly digital tasks from high-GDP sectors and excluded many non-digital or peripheral duties, tempering claims of near-term job automation.

Original Description

An OpenAI report released in the last 24 hours is the best look we have as to whether 2025 AI can automate your job. I’ll go through 4 unexpected findings, from which model is best at what, to practical tips and massive caveats. Plus UFC robots, radiologist essay, don’t trust videos and the blockers to the singularity.

Gray Swan: https://app.grayswan.ai/ai-explained

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:

00:00 - Introduction

00:55 - OpenAI Report Summary

02:40 - Tipping Point Speed-up

04:11 - Better than Industry Experts?

06:33 - Big Caveat

11:10 - Karpathy and the Radiologist Analogy

13:30 - Outro

GDPval: https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf12ce/GDPval.pdf

GDP Impact: https://fred.stlouisfed.org/release/tables?rid=331&eid=211

Task List: https://www.onetonline.org/link/summary/11-9141.00

Summers Tweet: https://x.com/LHSummers/status/1971252567981146347

Emad: https://x.com/EMostaque/status/1971254153067593739

Robots: https://x.com/cixliv/status/1967663286679478759

Unitree G1: https://x.com/UnitreeRobotics/status/1970039940022239491

Don’t Trust Video: https://x.com/AISafetyMemes/status/1970453369446871420

AGI Tweet: https://x.com/hyhieu226/status/1968378785709133915

Blockers to the Singularity: https://www.patreon.com/posts/blockers-to-and-139264812

Framework: https://gemini.google.com/share/f4b9c85a6ae9

METR Study (Dev Slowdown): https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

Karpathy Tweet: https://x.com/karpathy/status/1971220449515516391

Radiology Essay: https://worksinprogress.co/issue/the-algorithm-will-see-you-now/

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
