GrafanaCON Barcelona: A Year of Innovation and the Dawn of Agentic Observability 🚀
GrafanaCON 2024 has landed in Barcelona, and what a kickoff it’s been! This year marks the 10th GrafanaCON, and it’s shaping up to be the biggest and best yet, a far cry from the humble beginnings with just 30 attendees in an office. Grafana Labs co-founders Raj Dutt and Torkel Ödegaard kicked things off with a look back at Grafana’s journey and a thrilling glimpse into the future of observability.
From Barcelona Beginnings to Global Reach 🌍
Did you know Grafana has a secret Spanish connection? Torkel Ödegaard revealed that the foundational features of Grafana were actually written in Barcelona back in 2013 during a Christmas holiday, while he was battling a cold. This unexpected origin story adds a unique flavor to the project, even sparking a playful debate about renaming it “GrafañaCON”!
Grafana Labs is clearly on a mission to make observability accessible to everyone. Beyond the annual GrafanaCON, they drew over 7,000 workshop signups and 80,000 webinar attendees last year, a sign of how far their global marketing and events team reaches. This year’s event is dedicated to celebrating the Grafana community, sharing updates on open-source projects, and exploring how these tools solve new challenges. The community itself is booming, with over 100 Grafana Champions worldwide actively sharing their knowledge.
Grafana Labs: An Open Strategy for a Connected World 🌐
“Open” is more than just open-source software for Grafana Labs; it’s a core strategy encompassing open standards like OpenTelemetry and embracing diverse ecosystems like Prometheus, Elastic, and ClickHouse. Grafana acts as the connective tissue, bringing these disparate worlds together. This open ethos extends to their culture, fostering transparency internally and externally, even sharing the difficult lessons learned from security incidents and outages.
The Grafana Journey: From No Customers to Global Impact 📈
Raj Dutt shared the remarkable journey of Grafana Labs, evolving from a time with no customers or revenue to its current standing.
- Step 1: The Default Dashboard: The early days were about building open-source software and making it popular, a period Raj fondly misses for its simplicity.
- Step 3 & 4: Making Observability Easier: The current focus is on simplifying observability for every software team, a key theme for this year’s conference.
- The Agentic Paradigm Shift: A significant recent development is the belief that agents will become primary consumers of observability data, fundamentally changing how we interact with and operate software. Grafana Labs aims to be the leading agentic observability cloud.
- Beyond IT Observability: The vision extends to applying observability principles to the entire business, looking at metrics like revenue and LTV, and seeing a convergence of business and observability data.
This journey has been a shared one, fueled by invaluable feedback, pull requests, and contributions from the global Grafana community.
Astonishing Growth and Industry Recognition ✨
The numbers speak for themselves:
- 35 million active Grafana users worldwide.
- Over 1 million companies using Grafana.
- Grafana Labs has surpassed $400 million in annualized recurring revenue.
- The largest privately held observability company globally.
- Over 1,600 “Grafanistas” across 40 countries.
The company’s evolution has also been recognized by industry analysts. Grafana Labs made its debut on Gartner’s Magic Quadrant for observability three years ago and has since climbed to the furthest right position for completeness of vision, a testament to their collaborative development and community-driven progress.
What’s New in Grafana 13: Easy, Scalable, Everywhere 🛠️
David Kalsmit, VP of Engineering, took the stage to unveil the exciting advancements in Grafana 13, focusing on three key themes: easy to get started, built for scale, and available everywhere.
Enhancements for a Seamless Experience 💡
- 170 Data Sources & 120 Visualizations: The catalog continues to expand, offering unparalleled flexibility.
- Graphviz Panel (Private Preview): Visualize flowcharts and complex diagrams directly within Grafana, powered by the Graphviz language.
- Visual Refresh: A modern visual overhaul is underway, starting with the gauge panel.
- Improved Annotations: Multi-row annotations with clustering make it easier to manage and interpret events like deployments, incidents, and alerts.
- Dynamic Dashboards (GA): Now generally available, dynamic dashboards with conditional rendering and tab layouts offer enhanced interactivity and organization.
Empowering Teams and Organizations 👨‍💻
- Dashboard Templates: Jumpstart dashboard creation with pre-built templates for common methodologies like DORA, USE, RED, and Golden Signals. The ability to define organizational templates is on the horizon.
- Saved Queries: Promote consistency and efficiency by defining and sharing reusable queries across teams, empowering less experienced users.
- Interactive Learning Paths: An in-product help system guides users through workflows and provides instant assistance, with customizable paths for organizational onboarding.
Operations and Scalability at its Finest 🦾
- Git Sync (Production Ready): Manage dashboards as code with a robust two-way integration, enabling pull requests, reviews, and automated updates. This required a major data model and architecture rework but now offers production-grade reliability and disaster recovery. Connectors for GitHub, GitLab, Bitbucket, and pure Git are supported.
- Google Scale: Grafana is now being used internally at Google for SRE purposes, a significant endorsement that has also driven enhancements in dashboard reusability.
- Grafana Marketplace Pilot: An expanding catalog of apps and plugins allows for greater ecosystem participation, with a call for feedback on desired apps and partnership opportunities.
Loki’s Architectural Overhaul: Faster, Smarter Logging 🔍
Poyzan, who leads the EMEA group on the Loki team, unveiled Loki’s new architecture, designed to address the evolving landscape of structured logging and increasing data volumes.
The Challenges of Modern Logging 📈
- Structured Logging & OpenTelemetry: These trends are leading to larger log volumes with more fields per log line.
- Targeted Queries: Users now perform complex analytical queries over massive datasets, drilling down by service, aggregating by status codes, and extracting business insights from custom fields.
- Performance Bottlenecks: Slow or non-returning queries are a significant frustration.
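To make the "drill down by service, aggregate by status code" pattern concrete, here is a minimal Python sketch. It is not how Loki evaluates queries; the log lines, the field names (service, status, latency_ms), and the helper count_status_by_service are all invented for illustration. Note that it parses every line, including the ones it discards, which is the kind of wasted work that becomes painful at large volumes.

```python
import json
from collections import Counter

# Hypothetical structured log lines; the field names are illustrative assumptions.
LOG_LINES = [
    '{"service": "checkout", "status": 200, "latency_ms": 41}',
    '{"service": "checkout", "status": 500, "latency_ms": 950}',
    '{"service": "cart", "status": 200, "latency_ms": 12}',
    '{"service": "checkout", "status": 500, "latency_ms": 1020}',
]


def count_status_by_service(lines, service):
    """Drill down to one service, then aggregate its lines by status code."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)  # every line is parsed, even non-matching ones
        if record["service"] == service:
            counts[record["status"]] += 1
    return counts


checkout_counts = count_status_by_service(LOG_LINES, "checkout")
print(dict(checkout_counts))
```

With only four lines the full scan is harmless; the same loop over billions of lines is where slow or non-returning queries come from.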
Rebuilding for Performance and Scale 🏗️
The new Loki architecture introduces three major changes:
- Ingestion Pipeline with Kafka: Separates reads from writes using Kafka, so writes are durable before they reach the ingesters. This eliminates the operational headache of writes stalling under heavy queries. It also addresses data duplication, with the goal of writing each log line only once.
- New Query Engine: Filters data closer to the storage layer, preventing irrelevant data from entering the pipeline. A scheduler distributes work across a pool of workers for dramatically faster results on large aggregation and drill-down queries.
- Columnar Storage: Natively supports selective reading. Instead of parsing every log line, data is organized by field into columns, allowing for direct retrieval of specific information. This format is still schemaless, with Loki organizing the data automatically.
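The idea behind columnar storage and selective reading can be sketched in a few lines of Python. This is a toy illustration of the general technique, not Loki's storage format; the records, field names, and the helpers to_columns and count_where are assumptions made up for the example.

```python
# Hypothetical records; the field names are illustrative assumptions.
ROWS = [
    {"service": "checkout", "status": 200, "message": "order placed"},
    {"service": "checkout", "status": 500, "message": "payment timeout"},
    {"service": "cart", "status": 200, "message": "item added"},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per field.

    Fields are discovered from the data itself, so no schema is declared
    up front; a field missing from a record is stored as None.
    """
    fields = sorted({key for row in rows for key in row})
    return {field: [row.get(field) for row in rows] for field in fields}


def count_where(columns, field, value):
    """Answer a filter-and-count query by reading a single column."""
    return sum(1 for cell in columns[field] if cell == value)


COLUMNS = to_columns(ROWS)
errors = count_where(COLUMNS, "status", 500)
print(errors)
```

Counting errors here touches only the status column; the message column, usually the largest, is never read. That is the intuition behind the reduction in scanned data, under the assumption that queries reference a small subset of fields.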
Remarkable Performance Gains 📊
Internal testing on a massive Grafana Cloud instance shows:
- 20x less data scanned.
- 10x faster query times.
While “needle in a haystack” queries without stream selectors can still be slow, Grafana Labs has acquired technology for targeted secondary indexes, promising up to a 99% reduction in byte scans, currently in private preview for cloud and coming to open source this year.
Note: The new architecture components will be available across all Loki modes, with a commitment to a seamless single-binary experience for smaller deployments.
OpenTelemetry: Aiming for “Boring” Stability 🎯
Ted Young, an OpenTelemetry co-founder, shared the project’s top goal for the year: to be as boring as possible. This means achieving stability across tracing, metrics, and logs, which would enable OpenTelemetry to reach graduated status within the CNCF.
Stabilization for Universal Adoption ✅
- Reaching 1.0: The focus is on stabilizing all core components and instrumentation packages to meet enterprise security requirements and ensure availability everywhere.
- Instrumentation Overhaul: The primary challenge lies in rolling out stable semantic conventions to all instrumentation packages across every language. This involves a two-stage rollout: marking de facto stable packages as 1.0 and then lifting data to the latest semantic conventions.
- Package Management Integration: A major push is to simplify installation through native package management (e.g., apt-get install OpenTelemetry on Linux) and Kubernetes operators, making it a one-click, system-wide installation for operators.
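The second stage of the rollout, "lifting" data to the latest semantic conventions, amounts to renaming legacy attribute keys. A minimal Python sketch of that idea follows. The three renames are taken from the OpenTelemetry HTTP semantic-convention migration; the function lift_attributes and the sample attributes are assumptions for illustration, not part of any OpenTelemetry SDK.

```python
# Renames from the OpenTelemetry HTTP semantic-convention migration.
# The mapping is deliberately partial; a real migration covers many more keys.
HTTP_ATTRIBUTE_RENAMES = {
    "http.method": "http.request.method",
    "http.status_code": "http.response.status_code",
    "http.url": "url.full",
}


def lift_attributes(attributes):
    """Return a copy of the attributes with legacy keys renamed.

    Keys without a known rename (for example custom attributes) pass
    through unchanged. The input dictionary is not modified.
    """
    return {
        HTTP_ATTRIBUTE_RENAMES.get(key, key): value
        for key, value in attributes.items()
    }


# Hypothetical span attributes emitted by an older instrumentation package.
legacy = {"http.method": "GET", "http.status_code": 200, "custom.tenant": "acme"}
lifted = lift_attributes(legacy)
print(lifted)
```

Doing the rename as a separate step is what lets a package be marked 1.0 first and have its data lifted later, without both changes landing at once.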
Integrated OpenTelemetry: Seamless Installation 📦
The goal is to make OpenTelemetry installation as easy as installing any other software, reducing the complexity for both developers and operators. This integrated approach aims for consistent rollout across organizations.
AI Takes Center Stage: Actually Useful Intelligence 🤖
Mat Ryer and Sven Grossman showcased Grafana Labs’ commitment to building an AI-native platform, emphasizing actually useful AI over hype.
Grafana Assistant: Your AI Observability Co-pilot 💬
- Deep Integration: The Grafana Assistant, a sidebar chat app, is deeply integrated into Grafana, helping users write complex queries, investigate issues, and build dashboards using natural language.
- Agentic Capabilities: It leverages LLMs to gather context, investigate issues across metrics, logs, tracing, and profiling, and build deep links to specific views.
- Assistant Investigations: Swarm multiple agents around a problem for comprehensive analysis.
- Customizable Agents: Control agent behavior, answers, and actions, ensuring they adhere to organizational standards.
- Automations: Schedule regular tasks, like generating incident reports with common themes and root causes, all through natural language prompts.
- Availability Everywhere: Grafana Assistant is now available for Grafana OSS and Enterprise users, not just Grafana Cloud. It connects to Grafana Cloud for LLM communication, with data remaining on-premises.
GCX: Grafana Cloud CLI for Developers 💻
Ward Becker introduced GCX, the Grafana Cloud CLI, bringing the power of Grafana Cloud and the Assistant to the command line and agentic coding environments.
- Bridging the Gap: GCX bridges the gap between local development and observability insights, allowing developers to access metrics, logs, and synthetic monitoring data directly from their terminal.
- Agentic Coding Integration: It seamlessly integrates with tools like GitHub Copilot and Cursor, providing agents with real-time production context to fix issues and improve code.
- Automated Fixes: GCX enables agents to fetch investigation context, identify issues in code, and even initiate deployments, reducing the time from alert to resolution.
AI Observability: Understanding Your Agents 🧠
- Open Observability Benchmark: Grafana Labs is establishing industry standards for LLM performance in observability tasks.
- AI Observability App (Grafana Cloud): A new solution for building and monitoring agentic applications, providing a 10,000-foot view of cost and performance, and drilling down into agent conversations for forensic detail.
- Key Features:
  - Metrics Insights: Spot slow tool calls, expensive agents, and drill into conversations.
  - Conversation Debugging: Debug down to sub-agent, tool call, and token level.
  - System Prompt & Tool Analysis: Analyze prompts and tool definitions based on real user conversations.
  - Online Evaluations: Implement real-time continuous evaluations of agents in production, monitoring the impact of prompt changes.
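As a rough illustration of what "spot slow tool calls and expensive agents" means in practice, the Python sketch below aggregates a handful of tool-call records. It does not use the AI Observability app or any Grafana API; the record shape, the field names, and the helpers slow_tools and tokens_by_agent are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tool-call records from agent conversations; all fields are assumptions.
TOOL_CALLS = [
    {"conversation": "c1", "agent": "triage", "tool": "query_logs", "seconds": 4.0, "tokens": 1200},
    {"conversation": "c1", "agent": "triage", "tool": "query_metrics", "seconds": 0.5, "tokens": 300},
    {"conversation": "c2", "agent": "reporter", "tool": "query_logs", "seconds": 6.0, "tokens": 2500},
]


def slow_tools(calls, threshold_seconds):
    """Return the mean duration of each tool whose mean exceeds the threshold."""
    durations = defaultdict(list)
    for call in calls:
        durations[call["tool"]].append(call["seconds"])
    means = {tool: sum(values) / len(values) for tool, values in durations.items()}
    return {tool: mean for tool, mean in means.items() if mean > threshold_seconds}


def tokens_by_agent(calls):
    """Sum token usage per agent as a simple proxy for cost."""
    totals = defaultdict(int)
    for call in calls:
        totals[call["agent"]] += call["tokens"]
    return dict(totals)


print(slow_tools(TOOL_CALLS, 2.0))
print(tokens_by_agent(TOOL_CALLS))
```

The aggregate view points at a tool or agent worth investigating; the conversation field is what would let you drill from that summary back into the individual exchanges.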
The Future is Agentic: Democratizing Observability 🌟
Raj Dutt and Torkel Ödegaard closed the keynote by reinforcing Grafana’s core mission of democratizing metrics and data visualization. The advancements in Grafana 13 and the new AI features are realizing this mission in ways previously unimaginable.
The commitment to “actually useful AI” is paramount, with a focus on engineering-driven solutions that provide real value. The new Loki engine and Grafana schema are not just for AI but are crucial for the iterative, hypothesis-driven nature of agentic workflows.
The most exciting announcement is that Grafana Assistant is now available for free to all 35 million Grafana users worldwide, extending its power beyond the “1%” to the “99%”. This bold move underscores Grafana’s dedication to making advanced observability tools accessible to everyone.
GrafanaCON Barcelona has set a clear vision for the future: a world where observability is easier to access, more scalable than ever, and intelligently augmented by AI, all built on a foundation of openness and community.