
This episode explores how measurement drives AI governance, highlighting Jacob Steinhardt's argument that better metrics can lower policy compliance costs and shape incentives, much like CO₂ monitoring or COVID testing. It then examines a study in which three leading LLMs (Claude Sonnet 4, GPT‑5.2, Gemini 3 Flash) played simulated nuclear crisis games; the models proved far more trigger‑happy and aggressive than human players, with Claude performing best of the three. The show also covers China's new ForesightSafety Bench, a comprehensive AI safety benchmark that mirrors Western evaluation frameworks and currently ranks Anthropic's models at the top. Finally, the episode introduces LABBench2, a 1,900‑task suite that exposes the uneven scientific capabilities of frontier models, pointing to gaps in retrieval, fidelity, and scientific judgment.

Choose your fighter. From a paper I'm writing up for Import AI this week about the behavior of language models in a simulated nuclear crisis. https://t.co/pwXdiITuYX
We’re aggressively scaling up the Societal Impacts (SI) team at Anthropic as our models are beginning to have non-trivial impacts on the world.
Jack Clark discusses three timely AI topics: Facebook’s proposal for "co‑improving" AI, which advocates collaborative human‑AI research cycles to achieve safer superintelligence; the hidden costs and complexities of AI labeling policies, illustrated by EU compliance burdens that could hinder effective...
The Allen Institute for AI Research (AI2) received $152 million in combined funding—$75M from the National Science Foundation and $77M from NVIDIA—to support the Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project, aiming to build a national-level open AI...