Everything in Moderation

Weekly coverage of content moderation, online safety, platforms, and policy

News•Apr 21, 2026

Is This What Assessing Risk Actually Looks Like?

Anthropic has published a 58‑page Alignment Risk Update alongside its new Mythos model, detailing the probability of harmful autonomous actions within organizations. The document arrives after the Grok scandal, which triggered EU and UK investigations under the Digital Services Act and the Online Safety Act. Regulators are increasingly demanding pre‑emptive risk assessments, and Anthropic’s voluntary disclosure marks a rare example of a frontier AI lab openly outlining potential misalignment. The move positions Anthropic as a possible benchmark for future compliance in the rapidly evolving AI safety landscape.

By Everything in Moderation

News•Apr 17, 2026

An AI Model Most Dangerous, Europe’s Child Safety Muddle and Altman Fights Back

The European Commission rolled out an age‑verification app that lets users prove their age with passports, national IDs or trusted institutions, a step toward the EU’s broader child‑safety agenda. A security researcher quickly exposed vulnerabilities that could leak sensitive personal...

By Everything in Moderation

News•Apr 13, 2026

How to Operationalise Platform Policy at Scale

Alice Hunsberger’s third Trust & Safety Insider guide tackles the often‑overlooked challenge of operationalising platform policy at scale. While policy writing accounts for roughly 20% of the effort, the remaining 80% involves translating rules into actionable guidance for human reviewers,...

By Everything in Moderation

News•Mar 10, 2026

The Best T&S Insider Reads From the Last Two Years

Ben Whitelaw celebrates two years of the Trust & Safety Insider newsletter, highlighting five standout articles that trace the field’s evolution from generative AI breakthroughs to new European regulations. The pieces showcase shifting priorities toward prosocial design, global policy tensions,...

By Everything in Moderation

News•Feb 24, 2026

How to Write User-Facing Platform Policies

The article outlines best practices for crafting user‑facing platform policies, emphasizing clarity, legal alignment, and iterative improvement. It argues that transparent policies boost trust while reducing compliance risk for platforms. Practical guidance includes language simplicity, consistent structure, and real‑world examples....

By Everything in Moderation
