OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings

AI Explained
Sep 26, 2025

Why It Matters

The results show meaningful progress toward task-level parity with human experts, signaling potential productivity gains and disruption in specific white-collar workflows. They also underscore that current models are not yet able to automate entire occupations, which affects how firms, regulators and investors should plan for AI adoption.

Summary

OpenAI published a study comparing frontier language models to industry experts on realistic, digitally oriented tasks and found that some models are approaching the quality of expert deliverables. Anthropic’s Claude Opus 4.1 outperformed OpenAI’s models and in many cases came close to human experts, while performance varied significantly by file type and sector (PDF, PowerPoint and Excel tasks fared best). The study also found that sufficiently capable models, exemplified by GPT-5, can speed up expert workflows, but that weaker models do not provide review-time savings. Crucially, the paper covered only predominantly digital tasks from high-GDP sectors and excluded many non-digital or peripheral duties, which tempers claims of near-term job automation.

Original Description

An OpenAI report released in the last 24 hours is the best look we have as to whether 2025 AI can automate your job. I’ll go through 4 unexpected findings, from which model is best at what, to practical tips and massive caveats. Plus UFC robots, radiologist essay, don’t trust videos and the blockers to the singularity.
Chapters:
00:00 - Introduction
00:55 - OpenAI Report Summary
02:40 - Tipping Point Speed-up
04:11 - Better than Industry Experts?
06:33 - Big Caveat
11:10 - Karpathy and the Radiologist Analogy
13:30 - Outro
