
AI Pulse

AI

OpenAI Is Asking Contractors to Upload Work From Past Jobs to Evaluate the Performance of AI Agents

WIRED AI • January 10, 2026

Companies Mentioned

  • OpenAI
  • Anthropic
  • Google (GOOG)
  • Mercor
  • Scale AI
  • Getty Images (GETY)
Why It Matters

The program shows how AI labs are turning real enterprise output into training data, accelerating model capabilities but exposing firms to legal and ethical risks around confidential information.

Key Takeaways

  • Contractors upload actual work files for AI performance testing
  • OpenAI mandates scrubbing of confidential and personal data
  • Legal risk: potential trade‑secret misappropriation claims
  • Industry trend: paid contractor networks fuel AI training data
  • Human baseline metrics guide AGI progress assessments

Pulse Analysis

OpenAI’s latest evaluation effort reflects a shift from synthetic benchmarks to real‑world task measurement. By collecting concrete deliverables—presentations, spreadsheets, code snippets—directly from professionals, the company can compare AI output against a human baseline across diverse industries. This granular data promises more accurate assessments of model competence, informing investors and regulators about progress toward artificial general intelligence. However, the reliance on authentic work introduces a complex layer of data governance. Contractors are instructed to remove proprietary details, and OpenAI even provides a "Superstar Scrubbing" tool to aid the process, yet the effectiveness of automated redaction remains uncertain.

The legal landscape surrounding this data pipeline is fraught with risk. Intellectual‑property lawyers warn that even heavily scrubbed documents may still contain trade secrets or confidential strategy, exposing contractors to claims of breaching non‑disclosure agreements and AI labs to misappropriation lawsuits. The onus falls on contractors to judge what constitutes protected information, a judgment that courts may later scrutinize. As AI systems increasingly ingest corporate artifacts, regulators may impose stricter oversight of data provenance, compelling firms to adopt more rigorous verification and audit mechanisms.

Beyond compliance, the contractor‑driven data model is reshaping the AI training economy. Companies like Handshake AI, Surge, and Scale AI have built multi‑billion‑dollar businesses supplying high‑quality, domain‑specific datasets to OpenAI, Anthropic, and Google. This burgeoning sub‑industry incentivizes the recruitment of skilled professionals capable of producing nuanced, task‑level outputs, driving up labor costs and creating a competitive market for data talent. As the race for superior enterprise‑grade AI intensifies, the balance between rapid model improvement and safeguarding corporate confidentiality will become a decisive factor in shaping the sector’s future.

Read Original Article