
Is This What Assessing Risk *Actually* Looks Like?
Anthropic has published a 58‑page Alignment Risk Update alongside its new Mythos model, detailing the probability of harmful autonomous actions within organizations. The document arrives after the Grok scandal, which triggered EU and UK investigations under the Digital Services Act and the Online Safety Act, respectively. Regulators are increasingly demanding pre‑emptive risk assessments, and Anthropic’s voluntary disclosure is a rare example of a frontier AI lab openly outlining potential misalignment risks. The move positions Anthropic as a possible benchmark for future compliance in the rapidly evolving AI safety landscape.

An AI Model Most Dangerous, Europe’s Child Safety Muddle and Altman Fights Back
The European Commission rolled out an age‑verification app that lets users prove their age with passports, national IDs or trusted institutions, a step toward the EU’s broader child‑safety agenda. A security researcher quickly exposed vulnerabilities that could leak sensitive personal...

How to Operationalise Platform Policy at Scale
Alice Hunsberger’s third Trust & Safety Insider guide tackles the often‑overlooked challenge of operationalising platform policy at scale. While policy writing accounts for roughly 20% of the effort, the remaining 80% involves translating rules into actionable guidance for human reviewers,...

The Best T&S Insider Reads From the Last Two Years
Ben Whitelaw celebrates two years of the Trust & Safety Insider newsletter, highlighting five standout articles that trace the field’s evolution from generative AI breakthroughs to new European regulations. The pieces showcase shifting priorities toward prosocial design, global policy tensions,...

How to Write User-Facing Platform Policies
The article outlines best practices for crafting user‑facing platform policies, emphasizing clarity, legal alignment, and iterative improvement. It argues that transparent policies boost trust while reducing compliance risk for platforms. Practical guidance includes language simplicity, consistent structure, and real‑world examples....