
Grafana Cloud now offers AI Observability, a unified platform for monitoring large language model (LLM) workloads in production. By integrating the OpenLIT SDK and OpenTelemetry, developers can automatically capture traces, metrics, and logs for multiple model providers, vector databases, and GPU resources. The solution provides real‑time visibility into latency, token usage, cost, and safety evaluations such as hallucination and toxicity detection. Pre‑built dashboards and alerting let teams enforce SLAs and optimize spend without building a custom observability stack.

Grafana has introduced one‑click integrations for its Drilldown apps, enabling users to add panels to dashboards, create alerts, and save searches without leaving the exploration view. The updates also bring an enhanced OpenTelemetry log display that surfaces key metadata inline,...

Grafana Assistant, an AI agent built into Grafana Cloud, now automates cloud cost optimization by translating natural‑language prompts into telemetry queries. It delivers 30‑day waste analyses, actionable recommendations, and transparent data without requiring PromQL expertise. Integrated with Model Context Protocol...

Apono has launched an integration with Grafana that provides Just-in-Time, policy-driven access to the platform’s underlying data sources. The solution continuously discovers data sources such as Elasticsearch, PostgreSQL, and CloudWatch, and grants engineers short-lived permissions based on predefined policies, on-call...

Grafana Tempo’s optional metrics‑generator can derive RED metrics directly from tracing data, eliminating the need for separate instrumentation. However, automatically creating metric series can trigger a cardinality explosion, driving up storage costs. In the Tempo 2.10 release, the team introduced a...
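
To make the cardinality concern concrete, here is a minimal Python sketch. It is not Tempo's implementation: the `Span` record, its fields, and the `red_metrics` helper are hypothetical names used only for illustration. The sketch groups synthetic spans by a chosen set of labels, keeps request, error, and duration totals per group, and shows how adding one high-cardinality label multiplies the number of series.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified span record; real trace spans carry many more fields.
    service: str
    route: str
    customer_id: str
    duration_ms: float
    is_error: bool


def red_metrics(spans, labels):
    """Aggregate spans into one RED-style series per distinct label combination.

    Each series keeps a request count, an error count, and a duration sum;
    a rate would be the request count divided by the observation window.
    """
    series = defaultdict(lambda: {"requests": 0, "errors": 0, "duration_ms_sum": 0.0})
    for span in spans:
        key = tuple(getattr(span, name) for name in labels)
        bucket = series[key]
        bucket["requests"] += 1
        bucket["errors"] += int(span.is_error)
        bucket["duration_ms_sum"] += span.duration_ms
    return dict(series)


# Synthetic data (an assumption for this sketch):
# 2 services x 3 routes x 50 customers, one span each.
spans = [
    Span(
        service=f"svc-{s}",
        route=f"/r{r}",
        customer_id=f"cust-{c}",
        duration_ms=10.0 * (r + 1),
        is_error=(c % 10 == 0),
    )
    for s in range(2)
    for r in range(3)
    for c in range(50)
]

low = red_metrics(spans, labels=("service", "route"))
high = red_metrics(spans, labels=("service", "route", "customer_id"))

print(len(low))   # series count with low-cardinality labels only
print(len(high))  # series count once a per-customer label is added
```

With these synthetic inputs, the series count grows from 6 to 300 once `customer_id` becomes a label, which is the kind of multiplication that drives up storage costs when metric series are generated automatically from traces.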

The Grafana "Big Tent" podcast highlighted the rise of agentic AI in observability, featuring Resolve AI’s Spiros Xanthos and Grafana engineers. They discussed how AI agents use knowledge graphs to automate root‑cause analysis and troubleshoot production incidents. A real‑world example...