Linear Digressions

Creator

0 followers

Explorations of ML and data science through applied, often unusual, use cases (historic big data context).

Blog•Mar 30, 2026

Benchmarking AI Models

Benchmarking large language models remains a nuanced challenge, as highlighted by two leading tests: MMLU, a 14,000‑question multiple‑choice exam covering fields from medicine to philosophy, and SWE‑bench, which tasks models with fixing authentic GitHub issues. The post examines how these benchmarks expose pitfalls such as Goodhart’s Law, data contamination, and the use of canary strings to detect leakage. It argues that high scores do not automatically translate into genuine intelligence or utility. Understanding these dynamics is essential for evaluating AI progress.

By Linear Digressions

Blog•Mar 23, 2026

The Hot Mess of AI (Mis-)Alignment

Anthropic’s new safety paper reframes AI misalignment as a statistical bias‑variance problem rather than a classic paper‑clip maximizer scenario. The research shows that as model intelligence and task complexity rise, both systematic bias and stochastic variance increase, heightening alignment risk....

By Linear Digressions

Blog•Mar 15, 2026

The Bitter Lesson

The "Bitter Lesson" argues that raw scale—more data, compute, and larger models—consistently outperforms clever, hand‑crafted algorithms. Historically, breakthroughs from Deep Blue to AlexNet illustrate this pattern, and modern large language models reinforce it. AI developers spend months fine‑tuning prompts only to...

By Linear Digressions

Blog•Mar 9, 2026

From Atari to Chat GPT: How AI Learned to Follow Instructions

ChatGPT’s ability to follow instructions stems from a decade‑long research trajectory that began with reinforcement learning from human preferences. Early work such as Christiano et al. (2017) taught agents to play Atari and walk robots, laying the foundation for preference‑based...

By Linear Digressions

Linear Digressions

Benchmarking AI Models

The Hot Mess of AI (Mis-)Alignment

The Bitter Lesson

From Atari to Chat GPT: How AI Learned to Follow Instructions

Technology Pulse

Top Publishers

Top Creators

Top Companies

Top Investors

Linear Digressions

Benchmarking AI Models

The Hot Mess of AI (Mis-)Alignment

The Bitter Lesson

From Atari to Chat GPT: How AI Learned to Follow Instructions