
Generating Metrics From Traces with Cardinality Control: A Closer Look at HyperLogLog in Tempo
Why It Matters
Accurate cardinality estimation prevents runaway metric costs and keeps observability pipelines scalable. Tempo's HyperLogLog-based estimate gives both self-hosted and Grafana Cloud users actionable insight into trace-derived metric usage.
Key Takeaways
- Metrics-generator creates span-based RED metrics from traces.
- Cardinality explosion can inflate metrics costs dramatically.
- HyperLogLog reduces memory per tenant to ~5 KB.
- Sliding window sketches enable stale series subtraction.
- New Grafana Cloud metrics forecast series demand before collection.
Pulse Analysis
Observability teams have long relied on RED metrics—request rate, error rate, and duration—to gauge service health, while tracing provides deep, request‑level context. Tempo’s metrics‑generator bridges the gap by converting spans into these key performance indicators, allowing organizations that instrument only with tracing to gain immediate metric visibility. The convenience comes with a hidden risk: each unique combination of span attributes spawns a new metric series, and uncontrolled growth can quickly inflate storage bills and strain monitoring back‑ends.
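To make the growth concrete, here is a minimal sketch of the worst-case arithmetic. The attribute names and value counts are hypothetical, chosen only for illustration. The product is an upper bound, since real traffic rarely produces every combination.

```python
from math import prod

# Hypothetical number of distinct values seen for each span attribute.
label_cardinalities = {
    "service": 20,
    "span_name": 50,
    "span_kind": 5,
    "status_code": 3,
}


def worst_case_series(cardinalities: dict) -> int:
    """Upper bound on series count: the product of per-label cardinalities."""
    return prod(cardinalities.values())


base = worst_case_series(label_cardinalities)

# Adding one high-cardinality attribute multiplies the bound.
with_customer = worst_case_series({**label_cardinalities, "customer_id": 1000})

print(base, with_customer)
```

Under these assumed numbers, a single extra attribute with 1,000 distinct values raises the bound from 15,000 to 15,000,000 series per metric.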
To tame this problem, Tempo 2.10 adopts HyperLogLog, a probabilistic data structure that estimates cardinality in a small, fixed amount of memory. By keeping a sliding window of 5-minute sketches, Tempo can both count new series and drop stale ones from the estimate without storing every identifier. The chosen precision of 10 consumes roughly 1 KB per sketch and yields a standard error of about 3%, whereas exact counting can require megabytes. Real-world deployments on Grafana Cloud confirm the estimator stays within this error band, offering a lightweight yet reliable view of active series.
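The idea can be sketched in a few lines of Python. This is an illustrative toy, not Tempo's implementation: the SHA-1 hash, the three-sketch window, and the helper names `window_estimate` and `sketch_of` are all assumptions. It models stale-series removal by evicting the oldest sketch and re-merging, which may differ from how Tempo does it. At precision 10 there are 2^10 = 1024 one-byte registers (about 1 KB), and the textbook standard error is 1.04 / sqrt(1024) ≈ 3.25%.

```python
import hashlib
import math
from collections import deque


class HyperLogLog:
    """Minimal HyperLogLog sketch for illustration; not Tempo's implementation."""

    def __init__(self, precision: int = 10) -> None:
        self.p = precision
        self.m = 1 << precision             # 1024 registers at precision 10
        self.registers = bytearray(self.m)  # one byte each, so about 1 KB

    def add(self, item: str) -> None:
        # Assumed hash: first 8 bytes of SHA-1, read as a 64-bit integer.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # Union of two sketches: element-wise maximum of the registers.
        out = HyperLogLog(self.p)
        out.registers = bytearray(map(max, self.registers, other.registers))
        return out

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)  # small-range correction
        return raw


def window_estimate(window) -> float:
    """Estimate distinct series across every sketch still in the window."""
    merged = HyperLogLog()
    for sketch in window:
        merged = merged.merge(sketch)
    return merged.estimate()


def sketch_of(series_ids) -> HyperLogLog:
    """Build one sketch from an iterable of (hypothetical) series IDs."""
    s = HyperLogLog()
    for sid in series_ids:
        s.add(f"series-{sid}")
    return s


std_error = 1.04 / math.sqrt(1 << 10)   # 0.0325 at precision 10

window = deque(maxlen=3)                # hypothetical: keep the last 3 sketches
window.append(sketch_of(range(5000)))          # series that later go stale
window.append(sketch_of(range(5000, 6000)))
window.append(sketch_of(range(5000, 6000)))
before = window_estimate(window)        # roughly 6000 distinct series
window.append(sketch_of(range(5000, 6000)))    # evicts the oldest sketch
after = window_estimate(window)         # roughly 1000 distinct series
```

Because merging is just a register-wise maximum, combining the sketches in a window costs the same no matter how many series each one saw. Once the oldest sketch falls out of the window, the series that appeared only in it stop contributing to the estimate.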
The practical payoff is immediate. New metrics such as `tempo_metrics_generator_registry_active_series_demand_estimate` surface the gap between configured limits and actual demand, enabling operators to adjust quotas or apply collector filters before costs spiral. Whether running Tempo on‑prem or via Grafana Cloud, teams now have a clear signal to balance trace‑derived metric richness against budget constraints, reinforcing a cost‑effective, scalable observability strategy.