
Trust But Canary: Configuration Safety at Scale
Meta’s Configurations team explained how the company safeguards large‑scale configuration rollouts using canary and progressive deployment techniques. The discussion covered the health‑check metrics and monitoring signals that catch regressions early, and an incident‑review culture that focuses on improving the system rather than assigning blame. AI‑driven analysis now reduces alert noise and speeds up bisecting to a root cause when issues arise. The episode is part of the Meta Tech Podcast, which showcases engineering practices at scale.
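To make the canary idea concrete, here is a minimal sketch of a staged rollout gate. It is not Meta's implementation: the function name `progressive_rollout`, the stage fractions, the single error-rate metric, and the `max_regression` threshold are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class StageResult:
    fraction: float    # share of the fleet that received the new config
    error_rate: float  # health metric observed at this stage
    healthy: bool      # True if the regression stayed within the threshold


def progressive_rollout(
    stages: Sequence[float],
    measure_error_rate: Callable[[float], float],
    baseline_error_rate: float,
    max_regression: float = 0.01,
) -> tuple[bool, list[StageResult]]:
    """Widen a config change stage by stage and stop at the first unhealthy stage.

    All names and thresholds here are illustrative assumptions, not a real API.
    """
    history: list[StageResult] = []
    for fraction in stages:
        # In a real system this would query monitoring after a soak period.
        error_rate = measure_error_rate(fraction)
        healthy = (error_rate - baseline_error_rate) <= max_regression
        history.append(StageResult(fraction, error_rate, healthy))
        if not healthy:
            # Halt before widening further; the caller would roll back here.
            return False, history
    return True, history
```

A caller might pass stages such as `[0.01, 0.1, 1.0]`, so that a regression surfacing at 10% of the fleet stops the rollout before it reaches everyone.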

How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines
Meta built a pre‑compute engine of 50+ specialized AI agents that scanned a data pipeline spanning more than 4,100 files across three repositories and produced 59 concise context files capturing tribal knowledge. This "compass" layer lifted AI coverage from roughly 5% to 100% of the...

KernelEvolve: How Meta’s Ranking Engineer Agent Optimizes AI Infrastructure
Meta unveiled KernelEvolve, an autonomous agent that automates low‑level kernel creation and tuning for its diverse AI accelerator fleet, which includes NVIDIA GPUs, AMD GPUs, custom MTIA silicon, and CPUs. By treating kernel optimization as a search problem, the system compresses weeks...

Meta Adaptive Ranking Model: Bending the Inference Scaling Curve to Serve LLM-Scale Models for Ads
Meta unveiled its Adaptive Ranking Model, a request‑centric inference stack that lets the company run LLM‑scale recommendation models for ads with sub‑second latency. The system combines inference‑efficient scaling, deep model‑hardware co‑design, and a multi‑card GPU serving layer to handle up...

Ranking Engineer Agent (REA): The Autonomous AI Agent Accelerating Meta’s Ads Ranking Innovation
Meta introduced the Ranking Engineer Agent (REA), an autonomous AI system that runs end‑to‑end machine‑learning experiments for ads ranking. REA generates hypotheses, launches training jobs, debugs failures, and iterates without continuous human oversight, using a hibernate‑and‑wake cycle for multi‑day workflows....

Patch Me If You Can: AI Codemods for Secure-by-Default Android Apps
Meta’s Product Security team unveiled a two‑pronged solution to harden Android apps at scale: secure‑by‑default frameworks that wrap risky OS APIs, and generative‑AI‑driven codemods that automatically migrate existing code to those frameworks. The AI system can propose, validate, and submit...

Investing in Infrastructure: Meta’s Renewed Commitment to Jemalloc
Meta announced a renewed focus on jemalloc, the high‑performance memory allocator that underpins its infrastructure. The company has unarchived the open‑source repository and outlined a roadmap to cut technical debt, modernize the codebase, and add features such as a stronger...