He Thinks AI Code May Break Everything - EP 59 Will Wilson

Core Memory Mar 4, 2026

Why It Matters

As software underpins critical infrastructure, hidden bugs can cause massive outages and security risks. Wilson’s approach promises a more reliable way to catch elusive failures before they reach production, offering a path to safer, more dependable digital services. This episode is timely as AI‑enhanced development tools are reshaping how engineers build and test code.

Key Takeaways

  • Traditional tests miss unknown edge cases, causing software failures
  • Antithesis uses deterministic simulations to reproduce and fix bugs
  • Property‑based testing randomizes inputs, uncovering hidden defects
  • Deterministic hypervisor enables repeatable testing of non‑deterministic systems

Pulse Analysis

Software outages, from Verizon’s nationwide blackout to everyday login errors, show that modern code still fails in ways engineers can only piece together after the fact, like investigators at a crime scene. Traditional example‑based tests ask engineers to anticipate every failure mode and then handcraft a case for each one. In practice, unknown‑unknown bugs slip through because software complexity rivals that of the largest machines ever built. Companies rely on observability tools that act as fire alarms, sounding only after the damage is done, and lack a reliable way to validate changes before release. This gap fuels costly downtime and erodes enterprises’ confidence in critical digital infrastructure.
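To make the gap concrete, here is a minimal Python sketch of how handcrafted cases can pass while a randomized check finds a failure nobody wrote a test for. The function `clamp_percent`, its bug, and the helper names are hypothetical and invented purely for illustration; nothing here comes from the episode or from any real product.

```python
import random


def clamp_percent(value: int) -> int:
    """Hypothetical function under test: should clamp any integer to 0..100."""
    if value > 100:
        return 100
    return value  # bug: negative inputs are returned unchanged


def example_based_tests() -> bool:
    """Handcrafted cases covering only the failure modes the author anticipated."""
    return clamp_percent(50) == 50 and clamp_percent(150) == 100


def find_counterexample(seed: int, trials: int = 1000):
    """Randomized check of the property 0 <= result <= 100.

    Returns the first input that violates the property, or None.
    """
    rng = random.Random(seed)  # seeded so the search itself is repeatable
    for _ in range(trials):
        value = rng.randint(-1000, 1000)
        result = clamp_percent(value)
        if not 0 <= result <= 100:
            return value
    return None


if __name__ == "__main__":
    print("example-based tests pass:", example_based_tests())
    print("counterexample found:", find_counterexample(seed=0))
```

The handcrafted tests pass because no one thought to try a negative number. The randomized check states only what must always be true and lets the generator search for inputs that break it.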

Antithesis tackles this problem with property‑based, autonomous testing that throws random inputs, network delays, and hardware faults at a deterministic simulation of the entire stack. Because the team built a fully deterministic hypervisor, any failure can be replayed exactly, which turns flaky, non‑deterministic bugs into reproducible scenarios. The simulation consumes standard infrastructure‑as‑code definitions (Kubernetes, Docker, Terraform), so even complex cloud deployments can be cloned into the simulated environment. This approach uncovers edge‑case crashes that engineers would be unlikely to anticipate, reducing the need for post‑mortem firefighting.
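The core idea of deterministic replay can be sketched at toy scale in Python. This is not Antithesis's hypervisor or API: the real system is described as making an entire stack deterministic, whereas this sketch only routes every random decision in a single process through one seeded generator. The counter service, the "ack_lost" fault, and all function names are assumptions made up for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class RunResult:
    seed: int
    trace: list   # (message id, injected fault) pairs, in order
    count: int    # final state of the toy counter service
    sent: int     # number of distinct requests the client issued


def simulate(seed: int, messages: int = 20) -> RunResult:
    """Run the toy system with faults chosen by a single seeded generator."""
    rng = random.Random(seed)  # the only source of randomness in the run
    trace = []
    count = 0
    for msg_id in range(messages):
        # Injected fault: about 10% of acknowledgements are lost in transit.
        fault = "ack_lost" if rng.random() < 0.1 else "none"
        trace.append((msg_id, fault))
        count += 1  # server applies the increment
        if fault == "ack_lost":
            # Client retries; the buggy server applies the increment again
            # because it does not deduplicate by message id.
            count += 1
    return RunResult(seed, trace, count, messages)


def property_holds(result: RunResult) -> bool:
    """Property: each distinct request is applied exactly once."""
    return result.count == result.sent


def first_failing_run(max_seeds: int = 1000):
    """Search seeds until one produces a property violation."""
    for seed in range(max_seeds):
        result = simulate(seed)
        if not property_holds(result):
            return result
    return None


if __name__ == "__main__":
    failing = first_failing_run()
    replay = simulate(failing.seed)
    print("failing seed:", failing.seed)
    print("replay matches:", replay.trace == failing.trace)
```

Because the seed fully determines which faults are injected and when, rerunning the failing seed reproduces the same trace and the same wrong final count. That is the sense in which a flaky bug becomes a reproducible one.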

The business impact is immediate: faster release cycles, lower outage risk, and measurable cost savings on debugging and observability tooling. Early adopters, from fintech firms to crypto exchanges, report that deterministic testing shortens incident resolution from days to minutes. As AI continues to generate code, the gap between writing and verifying it widens, making Antithesis’s simulation layer even more critical. Companies that embed deterministic testing into their CI/CD pipelines gain a competitive edge, turning software reliability from a liability into a strategic advantage.

Episode Description

Or why AI slop in a plane is a very bad thing
