In-Context Learning vs Supervised Fine-Tuning with Sharon Zhou

O’Reilly Media
Mar 24, 2026

Why It Matters

Choosing the right approach directly impacts operational expenses, latency, and data security, influencing a company’s ability to scale AI services profitably.

Key Takeaways

  • In‑context prompting works effectively for simple, low‑frequency tasks
  • Supervised fine‑tuning yields higher accuracy on private models
  • Fine‑tuned small models offer low latency and cost savings
  • Success depends on expertise in model tuning and evaluation
  • Empirical testing is essential to decide between approaches

Summary

The discussion centers on the trade‑offs between in‑context learning—embedding examples directly in prompts—and supervised fine‑tuning, where a model is retrained on task‑specific data.

In‑context prompting is quick to implement and can be cost‑effective when API calls are infrequent and the context window is small. Fine‑tuning, however, often delivers higher accuracy, enables deployment of smaller private models, and reduces latency and recurring API fees when run on‑premise.
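To make the contrast concrete, here is a minimal sketch of in‑context (few‑shot) prompting: labeled examples are embedded directly in the prompt string rather than used to retrain the model. The task, examples, and function name are hypothetical, chosen only for illustration.

```python
# Minimal sketch: building a few-shot (in-context) prompt.
# The sentiment task and examples below are hypothetical.

def build_few_shot_prompt(examples, query):
    """Embed labeled examples directly in the prompt context."""
    lines = ["Classify the sentiment of each review as positive or negative.\n"]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # The model is asked to complete the label for the new query.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

examples = [
    ("Great battery life and fast shipping.", "positive"),
    ("Stopped working after two days.", "negative"),
]
prompt = build_few_shot_prompt(examples, "The screen is crisp and bright.")
print(prompt)
```

Note that every added example enlarges the context window sent with each API call, which is why this approach is most cost‑effective when calls are infrequent and the context stays small.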

Sharon Zhou notes a split in practice: teams with tuning expertise see rapid ROI from custom models, while those lacking experience may struggle to achieve the desired results. She cites the “Haiku” family of compact hosted models as an example of low‑cost, low‑latency options that are still used via in‑context prompting rather than fine‑tuning.

The implication for businesses is clear: evaluate usage patterns, data‑privacy requirements, and internal skill sets, then run empirical tests to determine whether prompt engineering or model fine‑tuning delivers the optimal balance of performance and cost.
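One way to frame such an empirical test is a simple head‑to‑head accuracy comparison on a held‑out labeled set. The sketch below uses hypothetical stand‑in predictors in place of real prompted and fine‑tuned model calls; only the evaluation loop itself is the point.

```python
# Minimal sketch of an empirical head-to-head test.
# `prompted_model` and `finetuned_model` are hypothetical
# placeholders for real model calls.

def accuracy(predict, labeled_examples):
    """Fraction of examples a predictor labels correctly."""
    correct = sum(1 for text, label in labeled_examples
                  if predict(text) == label)
    return correct / len(labeled_examples)

# Toy stand-ins so the harness runs end to end.
def prompted_model(text):
    return "positive" if "great" in text.lower() else "negative"

def finetuned_model(text):
    keywords = ("great", "crisp")
    return "positive" if any(w in text.lower() for w in keywords) else "negative"

held_out = [
    ("Great battery life.", "positive"),
    ("The screen is crisp.", "positive"),
    ("Stopped working after two days.", "negative"),
]

print("prompted:", accuracy(prompted_model, held_out))
print("fine-tuned:", accuracy(finetuned_model, held_out))
```

In practice the same loop would also record latency and per‑call cost alongside accuracy, since those are the other axes the decision turns on.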

Original Description

In-context learning works by putting important information and examples into the prompt context itself, and it can be pretty effective for many use cases—but not all. In her recent conversation with Ben Lorica, AMD’s Sharon Zhou detailed the benefits and trade-offs of in-context learning and supervised fine-tuning, explaining when you may want to use one over the other. #shorts
