Administrative Law and AI’s Overconfidence

The Regulatory Review (Penn)
The Regulatory Review (Penn)Mar 23, 2026

Key Takeaways

  • AI-generated answers may appear authoritative yet contain errors
  • Courts review agency decisions under the arbitrary‑and‑capricious standard
  • Validated, narrow AI can assist routine administrative tasks
  • General‑purpose AI cannot supply the validated policy analysis that rulemaking requires
  • Overreliance on AI risks arbitrary agency actions and legal challenges

Summary

The article warns that large language models like ChatGPT often deliver confident, plausible‑sounding answers that can be factually wrong, likening them to overconfident taxi drivers. It explains that under the Administrative Procedure Act, courts will reject agency actions that rely on such unvalidated AI output as arbitrary and capricious. While narrow, validated AI tools can aid routine tasks, general‑purpose AI cannot replace the rigorous evidence‑based analysis required for rulemaking. Agencies must still compile thorough records and conduct impact analyses before adopting policy recommendations, even when AI suggests them.

Pulse Analysis

The rise of conversational AI has transformed how federal agencies gather information, but confidence does not equal competence. Large language models generate text by predicting likely word sequences, often producing polished responses that mask hallucinations or bias. When an agency cites such output without independent verification, it risks having its action set aside as arbitrary and capricious under the Administrative Procedure Act, a standard that demands a reasoned record, consideration of alternatives, and evidence‑based justification. Courts are unlikely to accept AI‑only rationales, especially for high‑stakes policy choices such as environmental standards.
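
To see why fluency and accuracy come apart, consider a deliberately tiny next‑word predictor. This is an illustrative sketch, not how the article describes the technology: the toy corpus and the bigram counting are invented here, and real models use neural networks over vast corpora. The core point carries over, though, since the sampler emits whatever continuation is statistically likely, with no check on whether it is true.

```python
"""Toy sketch of next-word prediction: a bigram model picks the most likely
continuation from counts alone, with no notion of factual truth. Purely
illustrative; real LLMs use neural networks over subword tokens."""

from collections import Counter, defaultdict

corpus = (
    "the agency issued the rule the agency reviewed the record "
    "the court vacated the rule"
).split()

# Count which word follows which in the toy corpus.
following: defaultdict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1


def continue_text(word: str, length: int = 5) -> str:
    """Extend a prompt by repeatedly picking the most probable next word."""
    out = [word]
    for _ in range(length):
        options = following.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])  # likeliest, not truest
    return " ".join(out)


print(continue_text("the"))  # fluent-looking output, zero fact-checking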

Nonetheless, AI is not uniformly prohibited in the regulatory process. Narrow, task‑specific tools, such as automated comment‑sorting algorithms, document‑drafting assistants, or data‑validation scripts, can streamline routine functions that traditionally consume junior staff time. When these tools are rigorously validated against benchmark datasets and their performance meets or exceeds that of human reviewers, agencies can rely on them without breaching legal standards. The key is documentation: agencies must show that the algorithm performs as intended and that ultimate decision‑making authority remains with human officials, as the sketch below illustrates.
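
As a concrete illustration of that documentation burden, the minimal Python sketch below scores a narrow comment‑sorting tool against a labeled benchmark and writes a dated audit record. The benchmark file, the keyword placeholder standing in for a real classifier, and the 0.92 human‑baseline accuracy are all assumptions for illustration, not details from the article.

```python
"""Minimal sketch: validating a narrow comment-sorting tool against a
labeled benchmark before agency use. The benchmark file, the placeholder
classifier, and the human-baseline figure are illustrative assumptions."""

import csv
import json
from datetime import datetime, timezone

HUMAN_BASELINE_ACCURACY = 0.92  # assumed accuracy of trained human sorters


def classify_comment(text: str) -> str:
    """Placeholder for the agency's validated, task-specific model."""
    # A real tool would call a vetted classifier; keyword length stands in
    # here so the sketch stays self-contained.
    return "substantive" if len(text.split()) > 25 else "non-substantive"


def validate(benchmark_path: str, report_path: str) -> bool:
    """Score the tool on a labeled benchmark and write an audit record."""
    total = correct = 0
    with open(benchmark_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):  # expects columns: text, label
            total += 1
            correct += classify_comment(row["text"]) == row["label"]
    accuracy = correct / total if total else 0.0
    passed = accuracy >= HUMAN_BASELINE_ACCURACY
    # Documentation: a dated record showing the tool performs as intended.
    with open(report_path, "w", encoding="utf-8") as f:
        json.dump(
            {
                "validated_at": datetime.now(timezone.utc).isoformat(),
                "benchmark": benchmark_path,
                "n_examples": total,
                "accuracy": accuracy,
                "human_baseline": HUMAN_BASELINE_ACCURACY,
                "passed": passed,
            },
            f,
            indent=2,
        )
    return passed


if __name__ == "__main__":
    if not validate("comment_benchmark.csv", "validation_report.json"):
        raise SystemExit("Tool underperforms human baseline; do not deploy.")
```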

The broader lesson for administrators is to treat general‑purpose AI as a supplemental aid rather than a decision‑maker. Any policy recommendation generated by a model like ChatGPT must be cross‑checked against scientific studies, stakeholder input, and cost‑benefit analyses before being incorporated into the rulemaking record. By embedding AI within a robust validation framework and preserving human judgment, agencies can harness efficiency gains while satisfying the APA’s demand for reasoned, evidence‑based action. This balanced approach mitigates legal risk and ensures that AI’s confidence does not eclipse the rigor required for sound public policy.
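
One way to encode that human‑in‑the‑loop discipline is to make the rulemaking record itself refuse AI‑derived entries that lack corroboration or a named approver. The sketch below is hypothetical: the RecordEntry class, its fields, and the example values are invented for illustration, not drawn from any agency system described in the article.

```python
"""Hypothetical sketch: a rulemaking-record entry that cannot be finalized
without independent evidence and a named human approver. All names and
values here are invented for illustration."""

from dataclasses import dataclass, field


@dataclass
class RecordEntry:
    recommendation: str                 # e.g., text suggested by a general-purpose model
    ai_generated: bool                  # flag AI-derived material explicitly
    supporting_evidence: list[str] = field(default_factory=list)
    approved_by: str | None = None      # human official with final authority

    def finalize(self) -> None:
        """Refuse entries that would leave the record unsupported."""
        if self.ai_generated and not self.supporting_evidence:
            raise ValueError("AI output needs independent corroboration first.")
        if self.approved_by is None:
            raise ValueError("A human official must sign off before entry.")


entry = RecordEntry(
    recommendation="Tighten the reporting threshold.",  # hypothetical example
    ai_generated=True,
)
entry.supporting_evidence += ["peer-reviewed study", "cost-benefit analysis"]
entry.approved_by = "Deputy Administrator"
entry.finalize()  # succeeds only with evidence and a named approver
```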
