#290: Always Be Learning
Digital Analytics Power Hour • Big Data
February 3, 2026 • 1h 6m

Why It Matters

Understanding learning rate helps organizations turn every experiment into actionable knowledge, not just celebrate obvious wins. This broader view strengthens product safety and accelerates innovation, making data‑driven decisions more robust and reducing costly missteps—an especially timely insight as more companies scale their experimentation programs.

Key Takeaways

  • Learning rate expands experiment success beyond simple win counts
  • Neutral experiments require proper power analysis to count as learning (see the sketch after this list)
  • Experimentation helps detect regressions, acting as a safety net
  • Distribution of win, regression, and neutral outcomes guides product strategy
  • Multi‑metric decision rules balance success goals with guardrail metrics
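
The second takeaway hinges on what "proper power analysis" means in practice. As a rough illustration only—the baseline rate, minimum detectable effect, and thresholds below are hypothetical, and this is not Spotify's tooling—a pre‑registered sample‑size calculation for a two‑proportion test might look like this:

```python
# A minimal sketch, assuming a two-sided two-proportion z-test at 80% power.
# All planning numbers are hypothetical, not taken from the episode.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.20          # current conversion rate (assumed)
min_detectable_lift = 0.01    # smallest absolute effect worth detecting (assumed)
alpha, target_power = 0.05, 0.80

effect_size = proportion_effectsize(baseline_rate + min_detectable_lift, baseline_rate)
required_n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=alpha, power=target_power, ratio=1.0
)

def neutral_counts_as_learning(observed_n_per_arm: int, significant: bool) -> bool:
    """A non-significant result is only an informative 'neutral' outcome
    if the experiment actually reached its planned sample size."""
    return (not significant) and observed_n_per_arm >= required_n_per_arm

print(f"Required users per arm: {required_n_per_arm:,.0f}")
```

The point is that "neutral" is earned by meeting a pre‑specified sample size, not by merely failing to find significance.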

Pulse Analysis

In this episode of the Analytics Power Hour, hosts Tim Wilson and Val Kroll and guest Mårten Schultzberg unpack Spotify’s shift from a narrow win‑rate focus to a broader "learning rate" framework. They argue that counting only experiments that produce a clear winner masks two critical outcomes: early detection of regressions and well‑designed tests that yield no statistically significant effect. By redefining success to include safety‑net wins and powered neutral results, organizations can capture a fuller picture of what their experiments teach, turning data into a continuous learning engine rather than a simple pass‑fail ledger.

Spotify’s implementation breaks learning into three categories: obvious wins, regression detections, and neutral experiments that meet pre‑specified power thresholds. The team stresses rigorous sample‑size calculations and ongoing power monitoring to ensure neutral tests truly reflect a lack of effect rather than insufficient data. They also track the distribution of these outcomes, using it as a strategic signal—high regression catches indicate a strong safety net, while a surge of neutral results may signal diminishing returns or the need to adjust product focus. This nuanced view helps product teams allocate experimentation bandwidth efficiently and avoid wasted effort.
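
To make the three‑bucket idea concrete, here is a small, hypothetical sketch of how an experimentation program might tally win rate versus learning rate. The category names and the powered‑neutral rule are simplified stand‑ins for whatever checks Spotify's platform actually performs:

```python
from dataclasses import dataclass

@dataclass
class ExperimentResult:
    success_improved: bool      # a success metric moved significantly in the good direction
    guardrail_regressed: bool   # a guardrail metric degraded significantly
    reached_planned_n: bool     # the pre-registered sample size (power) was met

def classify(r: ExperimentResult) -> str:
    if r.guardrail_regressed:
        return "regression_detected"   # safety-net learning: a bad change was caught
    if r.success_improved:
        return "win"                   # the classic "obvious winner"
    if r.reached_planned_n:
        return "powered_neutral"       # informative null: no meaningful effect
    return "inconclusive"              # under-powered; teaches us little

def program_rates(results: list[ExperimentResult]) -> tuple[float, float]:
    labels = [classify(r) for r in results]
    win_rate = labels.count("win") / len(labels)
    learning_rate = sum(label != "inconclusive" for label in labels) / len(labels)
    return win_rate, learning_rate
```

The distribution of the three "learning" labels, not just the win rate, is what the episode treats as the strategic signal.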

Beyond single‑metric analysis, Spotify adopts a multi‑metric decision framework that separates success metrics from guardrail metrics. At least one success metric must improve while no guardrails degrade, allowing teams to innovate without harming core experiences like podcast consumption when optimizing music recommendations. This balanced approach, coupled with a culture that encourages questioning and iteration, offers a roadmap for any data‑driven organization seeking to mature its experimentation practice and turn every test into actionable insight.
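
The decision rule itself is easy to express. This is a deliberately simplified sketch of the "at least one success metric up, no guardrails down" logic described above; the metric names are made up, and statistical details such as multiple‑comparison handling are omitted:

```python
def ship_decision(success_improved: dict[str, bool], guardrail_degraded: dict[str, bool]) -> bool:
    """Ship when at least one success metric shows a significant improvement
    and no guardrail metric shows a significant degradation."""
    return any(success_improved.values()) and not any(guardrail_degraded.values())

# Hypothetical example: optimizing music recommendations while guarding
# podcast consumption and app stability.
success_improved = {"music_minutes_played": True, "saves_per_user": False}
guardrail_degraded = {"podcast_minutes_played": False, "crash_rate": False}
print(ship_decision(success_improved, guardrail_degraded))  # True -> ship
```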

Episode Description

From a professional development perspective, you should always be learning: listening to podcasts, reading books, connecting with internal colleagues, following useful people on Medium and LinkedIn, and so on. Did we mention listening to podcasts? Well, THIS episode of THIS podcast is not really about that kind of learning. It's more about the sort of organizational learning that experimentation and analytics is supposed to deliver. How does a brand stay ahead of its competitors? One surefire way is to get smarter about its customers at a faster rate than its competitors do. But what does that even mean? Is it a learning to discover that the MVP of a hot new feature…doesn't look to be moving the needle at all? Our guest, Mårten Schultzberg from Spotify, makes a compelling case that it is! And the co-hosts agree. But it's tricky.

For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

Show Notes

#290: Always Be Learning – The Analytics Power Hour: Data and Analytics Podcast

Published: February 3, 2026


Links to Resources Mentioned in the Show

  • Article: Beyond Winning: Spotify’s Experiments with Learning Framework

  • Article: Two Questions Every Experiment Should Answer

  • Platform: Confidence by Spotify

  • Article: Choosing a Sequential Testing Framework — Comparisons and Discussions

  • Article: Bringing Sequential Testing to Experiments with Longitudinal Data (Part 1): The Peeking Problem 2.0

  • Article: Bringing Sequential Testing to Experiments with Longitudinal Data (Part 2): Sequential Testing

  • Article: Risk‑Aware Product Decisions in A/B Tests with Multiple Metrics

  • YouTube Channel: 3blue1brown by Grant Sanderson

  • Article: Escaping the AI sludge… why MVPs should be delightful

  • Conference: DataTune in Nashville – March 6‑7, 2026

  • Conference: Marketing Analytics Summit in Santa Barbara – April 28‑29

  • Article: The next data bottleneck by Katie Bauer

Photo by Jason Dent on Unsplash


Episode Transcript

00:00:05.75 [Announcer]: Welcome to the Analytics Power Hour. Analytics topics covered conversationally and sometimes with explicit language.

00:00:15.90 [Tim Wilson]: Hi, everyone. Welcome to the Analytics Power Hour. This is episode 290. I’m Tim Wilson, and I’m joined for this episode by Val Kroll. How’s it going, Val?

00:00:25.38 [Val Kroll]: Fantastic. Excited for today.

00:00:28.80 [Tim Wilson]: Outstanding. Unfortunately, we were supposed to also be joined by Michael Helbling for this show, but he’s gone all on brand for the winter and gotten the flu. Luckily, as we’re into our 11th year of doing this show now, we’ve learned a thing or two about rolling with the punches. And as it turns out, learning is the topic for today’s show. I mean, it’s implicit in all forms of working with data. We’re looking at analysis or research or experimentation results and hoping, just hoping that we come out of the experience with a deeper knowledge of something. I mean, and hopefully it’s something useful, more knowledge than we had before. It’s a simple idea. Sometimes though, it’s a little harder to execute in practice. That’s why we perked up when we came across an article from some folks at Spotify called Beyond Winning, Spotify’s experiments with learning framework. We’re excited to welcome one of the co‑authors of that piece to today’s show. Mårten Schultzberg is a product manager and staff data scientist at Spotify. He has a deep background in experimentation and statistics, including actually teaching advanced statistics in a prior role for a number of years. So who better to chat with about learning? Welcome to the show, Mårten.

00:01:59.34 [Val Kroll]: We definitely fought over who got to be on this one.

00:02:05.69 [Tim Wilson]: Mårten, in the article that I referenced in the opening, which we’re definitely going to link to in the show notes, it’s a great read. You and your co‑authors make the distinction between a win rate and a learning rate for experimentation. That’s the premise of the article – this win rate, this learning rate as a proposed metric that’s actually in use. That seems like a good place to start. Maybe you can explain what you were seeing as a drawback to too much focus on win rate as a metric for experimentation programs?

00:02:43.90 [Mårten Schultzberg]: Yes. I think it needs to take a little step back. It started when we rolled experimentation out at Spotify properly, at scale in 2019‑2020. We quickly realized that one of the biggest wins we made over and over again was to detect bad things early and avoid them – a sort of “dodge‑bullets” mechanism. That’s why we run so many experiments: to avoid shipping bad things unintentionally, side‑effects, etc.

I’ve also seen a lot of blog posts and papers about win rates from other companies – the rate of experiments where you find a variant that is better than the previous variant and you ship it, a clear winner. I felt that this focus under‑celebrates all the other types of wins you can make besides finding something that’s better than the current version. It also doesn’t really reflect how most companies, at least the ones I’m familiar with, actually use experimentation. They use it partly to optimize things – to find winners and continuously improve. But that’s only one part of the puzzle. The other part – using it as a safety net – wasn’t talked about enough. That’s where this idea sprung from.

00:04:22.92 [Val Kroll]: I love that. One thing, though, is that a metric like learning rate seems squishy. Win rate is objective – we can tally it in a column and calculate a percentage. Can you talk a little about how you thought about the criteria for determining “we learned something” from an experiment?

00:04:56.01 [Mårten Schultzberg]: First, this was a team effort. It was driven by the central experimentation team at Spotify, but many other data scientists doing product work were involved. We had a lot of good discussions about what learning means and when you actually get value from an experiment.

We see three ways you can learn from a test:

  1. Obvious winner – what others call win rate: you find a version that is better than the current version.

  2. Obvious loss/regression – you detect something bad (e.g., latency spikes, crashes). Avoiding a worse version is also a win.

  3. Neutral experiment – you run a well‑planned experiment, do a proper power analysis, and find no effect. Because the experiment was powered, the null result is informative: you can confidently say the change had no meaningful impact.

The neutral case is more nuanced and requires checking that the planned sample size was met, etc. Our tooling makes it fairly easy, but it does take some thinking to get right.

00:07:27.62 [Val Kroll]: I’m literally writing those down because there are so many things I want to dig into. Before we go to the 5,000‑foot view, I’m curious about the culture change internally. With so many people having access to run experiments, what was it like to shift away from win rate to this new metric? Was there resistance, excitement, or questioning?

00:08:01.99 [Mårten Schultzberg]: There’s always people questioning everything at Spotify – that’s one of the things I love about the company. Because we realized early that experimentation was a powerful tool to avoid mistakes, the definition of learning already incorporated that safety aspect. Over time, people came to appreciate that avoiding something bad is a great learning and very valuable for product development.

The neutral case is trickier; there’s a lot of discussion about how strict the definition should be (exactly powered vs. some wiggle room). We were eager to publish a clear definition and hoped other companies would adopt it, which is why we’re on this podcast. I’m not convinced our definition is the ultimate one, but it’s a good first step away from the naive “only wins count” view.

00:09:53.50 [Tim Wilson]: The raging cynic in me wonders if people will game the metric by running inconsequential small tests. The analyst in me thinks that happens a lot with analytics – you dig in, try to find a relationship, and don’t see it. That can be unsatisfying. How do you think about neutral experiments so they’re not just AA tests that give a false sense of learning?

00:11:01.44 [Mårten Schultzberg]: Great question. We’ve been thinking about what a healthy distribution of experiment types looks like – wins, regressions, and neutral experiments. The ideal mix depends on product strategy.

If a company is early‑stage with a lot to gain and little to lose, it can afford a higher rate of exploratory tests. If a product is mature, the goals shift. The key is not to look at learning rate in isolation. We want a reasonable learning rate and a high win rate. If we have a high learning rate, we know we’re not wasting experimentation effort. If we’re running many under‑powered or neutral experiments, we won’t be able to claim anything didn’t have an effect.


End of transcript.
