Inside the LLM Call: GenAI Observability with OpenTelemetry

OpenTelemetry Blog • May 14, 2026

Companies Mentioned

OpenAI

Microsoft (MSFT)

GitHub

Docker

Why It Matters

Without observability, teams guess why AI agents lag or overspend; standardized GenAI telemetry provides the data needed to optimize performance, reliability, and budget at scale.

Key Takeaways

•OpenTelemetry adds standardized traces, metrics, and events for GenAI workloads
•VS Code Copilot can emit full prompt content when enabled
•Aspire Dashboard visualizes GenAI spans, token usage, and latency histograms
•Token and latency metrics help control LLM costs and performance
•Community can influence GenAI semantic conventions via SIG feedback

Pulse Analysis

Generative AI applications are increasingly woven into enterprise software, yet their internal mechanics (model selection, token flow, and tool invocations) remain hidden. OpenTelemetry's GenAI semantic conventions fill that gap by defining a common schema for traces, metrics, and events. This uniformity lets developers instrument LLM-powered services, from code assistants to customer-service bots, without building custom logging pipelines for each one. The result is one consistent set of telemetry for performance diagnostics and compliance reporting across heterogeneous AI stacks.

In practice, enabling telemetry takes only a few settings in VS Code Copilot. Once activated, the extension exports OpenTelemetry data over OTLP to an endpoint, where tools like the Aspire Dashboard ingest and render it. The dashboard's trace explorer displays a hierarchical span tree, exposing attributes such as `gen_ai.request.model` and token counts, while a dedicated visualizer reformats raw JSON into a chat-style conversation view. Metrics panels surface latency histograms and token-usage distributions, allowing engineers to pinpoint slow model calls or unexpectedly high token consumption before they inflate production budgets.
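
To make the latency-histogram and token-count views more concrete, here is a minimal sketch in plain Python. It does not use the OpenTelemetry SDK or the Aspire Dashboard. It assumes span records have already been exported as dictionaries. The attribute keys follow the GenAI semantic conventions, while the `duration_s` field, the sample values, the bucket bounds, and the thresholds are all hypothetical.

```python
from bisect import bisect_left

# Hypothetical exported span records. The gen_ai.* keys follow the GenAI
# semantic conventions; `duration_s` and all values are made up.
SPANS = [
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 1200,
     "gen_ai.usage.output_tokens": 300, "duration_s": 0.8},
    {"gen_ai.request.model": "model-a", "gen_ai.usage.input_tokens": 9000,
     "gen_ai.usage.output_tokens": 1500, "duration_s": 6.5},
    {"gen_ai.request.model": "model-b", "gen_ai.usage.input_tokens": 400,
     "gen_ai.usage.output_tokens": 100, "duration_s": 0.3},
]

# Assumed bucket upper bounds in seconds (inclusive); not taken from the source.
BOUNDS = [0.5, 1.0, 2.0, 5.0, 10.0]


def latency_histogram(spans, bounds=BOUNDS):
    """Count spans per latency bucket; the final slot is the overflow bucket."""
    counts = [0] * (len(bounds) + 1)
    for span in spans:
        counts[bisect_left(bounds, span["duration_s"])] += 1
    return counts


def total_tokens(span):
    """Sum input and output tokens recorded on one span."""
    return span["gen_ai.usage.input_tokens"] + span["gen_ai.usage.output_tokens"]


def flag_outliers(spans, max_latency_s=5.0, max_tokens=8000):
    """Return spans that are slower or more token-hungry than the thresholds."""
    return [
        s for s in spans
        if s["duration_s"] > max_latency_s or total_tokens(s) > max_tokens
    ]


if __name__ == "__main__":
    print(latency_histogram(SPANS))
    for span in flag_outliers(SPANS):
        print(span["gen_ai.request.model"], span["duration_s"], total_tokens(span))
```

A dashboard does this aggregation for you; the sketch only illustrates what "pinpointing slow model calls or high token consumption" amounts to once the attributes are available in a uniform shape.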

Beyond debugging, standardized GenAI observability supports cost management and governance. By correlating token usage with model pricing, organizations can forecast expenses, enforce usage caps, and negotiate better vendor terms. The open-source nature of the conventions encourages community contributions, so the schema can evolve with emerging AI patterns like tool calling and multi-modal inputs. As more vendors adopt the spec, enterprises will benefit from interoperable monitoring across cloud, on-premises, and hybrid environments, turning generative AI from a black box into a measurable, controllable asset.
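
Correlating token usage with model pricing can be sketched as follows. This is an illustration under stated assumptions, not a billing implementation: the price table, model names, budget caps, and sample spans are hypothetical, and only the `gen_ai.*` attribute keys come from the GenAI semantic conventions.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000,000 tokens; real vendor pricing differs.
PRICES = {
    "model-a": {"input": 3.00, "output": 15.00},
    "model-b": {"input": 0.50, "output": 1.50},
}

# Hypothetical aggregated usage, keyed by GenAI semantic-convention attributes.
SPANS = [
    {"gen_ai.request.model": "model-a",
     "gen_ai.usage.input_tokens": 1_000_000,
     "gen_ai.usage.output_tokens": 200_000},
    {"gen_ai.request.model": "model-b",
     "gen_ai.usage.input_tokens": 2_000_000,
     "gen_ai.usage.output_tokens": 1_000_000},
]


def cost_by_model(spans, prices=PRICES):
    """Estimate spend per model from token counts and a per-token price table."""
    totals = defaultdict(float)
    for span in spans:
        model = span["gen_ai.request.model"]
        rate = prices[model]
        totals[model] += (
            span["gen_ai.usage.input_tokens"] * rate["input"]
            + span["gen_ai.usage.output_tokens"] * rate["output"]
        ) / 1_000_000
    return dict(totals)


def over_budget(costs, caps):
    """List models whose estimated spend exceeds their cap (no cap = unlimited)."""
    return sorted(m for m, c in costs.items() if c > caps.get(m, float("inf")))


if __name__ == "__main__":
    costs = cost_by_model(SPANS)
    print(costs)
    print(over_budget(costs, {"model-a": 5.0, "model-b": 10.0}))
```

Keeping the price table separate from the telemetry means the same token data can be re-costed when pricing changes, which is what makes forecasting and cap enforcement practical.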
