
Braintrust's Ankur Goyal on Why Evals Are the Core of AI Development

Greylock • February 24, 2026

Why It Matters

Treating evals as the core engineering discipline gives AI products durability and flexibility, enabling startups to iterate quickly, satisfy demanding customers, and stay ahead of rapid model changes.

Key Takeaways

  • Evals are the central engineering discipline for AI product success.
  • Flexible, model‑agnostic platforms reduce analysis paralysis in AI development.
  • Targeting high‑taste customers sharpens the focus on product‑market fit.
  • A customer‑obsessed culture prioritizes immediate bug fixes over sprint plans.
  • Purpose‑built data infrastructure handles massive volumes of LLM‑generated text efficiently.

Summary

In this interview, Ankur Goyal, founder and CEO of Braintrust, explains why evaluation frameworks—referred to as "evals"—are the cornerstone of modern AI product development. Drawing on his experience building Impira’s document‑extraction AI and leading Figma’s AI team, Goyal argues that the only controllable variable in a black‑box LLM is the definition of desired behavior, which is captured through rigorous evals.

Goyal highlights several practical insights: evals create durable engineering artifacts that survive model upgrades, they shift focus from endless prompt tweaking to systematic problem capture, and they enable a model‑agnostic development strategy that sidesteps the common analysis paralysis of choosing among OpenAI, Anthropic, and open‑source models. Braintrust’s platform embodies these principles, offering developer‑friendly tooling that works across models and attracts high‑taste customers such as Stripe, Instacart, and Airtable.
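
The claim that evals outlive any particular model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Braintrust's API: the names EvalCase, exact_match, run_eval, model_a, and model_b are all hypothetical, and the two "models" are stubs standing in for calls to real providers. The point is that the cases and the scorer stay fixed while the model behind the task is swapped.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One example of desired behavior: an input and the expected output."""
    input: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Hypothetical scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    task: Callable[[str], str],
    cases: Sequence[EvalCase],
    scorer: Callable[[str, str], float] = exact_match,
) -> float:
    """Run every case through the task and return the mean score."""
    if not cases:
        raise ValueError("at least one eval case is required")
    scores = [scorer(task(case.input), case.expected) for case in cases]
    return sum(scores) / len(scores)


# The eval set is the durable artifact: it does not mention any model.
CASES = [
    EvalCase(input="2 + 2", expected="4"),
    EvalCase(input="capital of France", expected="Paris"),
]


def model_a(prompt: str) -> str:
    """Stub standing in for one provider's model (assumption, not a real API)."""
    return {"2 + 2": "4", "capital of France": "paris"}.get(prompt, "")


def model_b(prompt: str) -> str:
    """Stub standing in for a different model that misses one case."""
    return {"2 + 2": "4"}.get(prompt, "unknown")


if __name__ == "__main__":
    for name, task in [("model_a", model_a), ("model_b", model_b)]:
        print(f"{name}: {run_eval(task, CASES):.2f}")
```

Because run_eval only needs a callable from string to string, moving to a new model means writing one new task function and rerunning the same cases, which is the sense in which the evals survive a model upgrade.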

He illustrates the approach with two anecdotes: building a purpose‑built logging and query system to handle massive volumes of LLM‑generated text, and fostering a culture where engineers drop sprint commitments to fix critical customer bugs immediately. According to Goyal, this customer‑obsessed mindset, combined with a narrow focus on a few “tasteful” clients, has driven rapid customer adoption.

The broader implication is clear: AI startups that embed eval‑centric workflows and adopt flexible, model‑agnostic platforms can accelerate product‑market fit, reduce technical debt, and maintain agility as the underlying models evolve. For investors and product teams, prioritizing evals and a customer‑first engineering culture is becoming a competitive differentiator in the fast‑moving AI landscape.

Original Description

Greylock Partner Saam Motamedi sits down with Ankur Goyal to discuss his latest company, Braintrust, which helps developers build AI that works.
Ankur is a repeat founder who started working in AI long before the debut of ChatGPT. He previously founded Impira, which was acquired by Figma, where he stayed on to lead the AI team.
The discussion covers:
- Why evals are core to AI development
- "Taste" in product development
- What a culture of customer obsession looks like
- Ankur's approach to scaling and managing a team
Timestamps:
00:00 - Intro
00:36 - When Saam and Ankur met
01:20 - Ankur's background (CS at CMU, MemSQL, Impira, Figma, Braintrust)
04:44 - Why evals are the core of product development in AI
06:17 - Why picking a development strategy is more important than picking a model
07:17 - Why "high taste" organizations use Braintrust
09:25 - Why the Braintrust platform is built on user feedback
10:16 - What being "customer obsessed" actually means
11:55 - Brainstore and database systems
14:15 - Whether Datadog is a good analogy for Braintrust
16:00 - Whether there will be a company for each type of eval
17:46 - Velocity of customer adoption at Braintrust
19:42 - Non-engineers using Braintrust
21:56 - Ankur's style of leadership
22:42 - Lessons on scaling GTM
24:36 - Why Braintrust takes recruiting as seriously as customer obsession
25:10 - The future for Braintrust
More on Ankur:
https://www.braintrust.dev/
https://www.linkedin.com/in/ankrgyl/
https://x.com/ankrgyl
More on Saam:
https://greylock.com/
https://www.linkedin.com/in/saammotamedi/
https://x.com/saammotamedi