Video•Feb 24, 2026
Braintrust's Ankur Goyal on Why Evals Are the Core of AI Development
In this interview, Ankur Goyal, founder and CEO of Braintrust, explains why evaluations ("evals") are the cornerstone of modern AI product development. Drawing on his experience building Impira’s document‑extraction AI and leading Figma’s AI team, Goyal argues that when the LLM is a black box, the one variable a team fully controls is its definition of desired behavior, and rigorous evals are how that definition is captured.
Goyal highlights several practical insights. Evals are durable engineering artifacts that survive model upgrades. They shift the focus from endless prompt tweaking to systematically capturing the problems a product must solve. They also enable a model‑agnostic development strategy, which sidesteps the analysis paralysis of choosing between OpenAI, Anthropic, and open‑source models. Braintrust’s platform embodies these principles, offering developer‑friendly tooling that works across models and has attracted discerning customers such as Stripe, Instacart, and Airtable.
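To make the idea of a durable, model‑agnostic eval concrete, here is a minimal sketch in Python. It is not Braintrust's API or anything shown in the interview; every name in it (`EvalCase`, `exact_match`, `run_eval`, and the two stand‑in models) is hypothetical and chosen only for illustration. The point it tries to show is that the eval cases and the scorer stay fixed while the model behind them is swapped.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One example of desired behavior: an input and the answer we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Hypothetical scorer: 1.0 if the answers match, ignoring case and whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    model: Callable[[str], str],
    cases: List[EvalCase],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Run every case through `model` and return the mean score."""
    if not cases:
        raise ValueError("an eval needs at least one case")
    scores = [scorer(model(case.prompt), case.expected) for case in cases]
    return sum(scores) / len(scores)


# The eval cases are the durable artifact: they do not mention any model.
CASES = [
    EvalCase(prompt="What is the capital of France?", expected="Paris"),
    EvalCase(prompt="What is 2 + 2?", expected="4"),
]


# Stand-ins for real model calls (assumptions for illustration only).
def model_a(prompt: str) -> str:
    answers = {
        "What is the capital of France?": "Paris",
        "What is 2 + 2?": "4",
    }
    return answers.get(prompt, "")


def model_b(prompt: str) -> str:
    answers = {
        "What is the capital of France?": "paris",
        "What is 2 + 2?": "5",
    }
    return answers.get(prompt, "")


if __name__ == "__main__":
    # The same cases and scorer are reused unchanged for each candidate model.
    for name, model in [("model_a", model_a), ("model_b", model_b)]:
        print(name, run_eval(model, CASES))
```

In a real project the stand‑in functions would wrap calls to whichever provider is under consideration, and the scorer would likely be richer than an exact match. The case list is the part that persists across model upgrades.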
He illustrates the approach with two anecdotes: building a purpose‑built logging and query system to handle massive volumes of LLM‑generated text, and fostering a culture in which engineers drop sprint commitments to fix critical customer bugs immediately. This customer‑obsessed mindset, combined with a narrow focus on a few “tasteful” clients, has allowed Braintrust to drive rapid growth for its early adopters.
The broader implication is that AI startups that embed eval‑centric workflows and adopt flexible, model‑agnostic platforms can reach product‑market fit faster, reduce technical debt, and stay agile as the underlying models evolve. For investors and product teams, prioritizing evals and a customer‑first engineering culture is becoming a competitive differentiator in the fast‑moving AI landscape.