Monitoring Kafka with Burrow: How & Best Practices

March 10, 2025
AutoMQ Team
9 min read
Kafka monitoring is essential for maintaining healthy data streaming ecosystems, and Burrow stands out as a purpose-built monitoring solution for tracking consumer lag without relying on arbitrary thresholds. Created by LinkedIn and released as open-source software, Burrow has become a crucial tool for organizations relying on Kafka for their streaming data needs. This comprehensive guide explores Burrow's architecture, setup process, configuration options, best practices, and integration capabilities to help you effectively monitor your Kafka infrastructure.

Understanding Burrow and Its Importance

Burrow fundamentally transforms how we approach Kafka consumer monitoring by providing an objective view of consumer status based on offset commitments and broker state. Unlike traditional monitoring solutions that rely on fixed thresholds, Burrow evaluates consumer behavior over a sliding window, making it more effective at detecting real problems while reducing false alarms.

Core Concepts of Burrow

Burrow operates by consuming the special internal Kafka topic to which consumer offsets are written (__consumer_offsets), providing a centralized service that monitors all consumers across all the partitions they consume. It examines several crucial factors:

  1. Whether consumers are committing offsets

  2. If consumer offset commits are increasing

  3. Whether lag is increasing

  4. If lag is increasing consistently or fluctuating

Based on this evaluation, Burrow assigns each partition a status (OK, WARN, or an error state such as STOP or STALL for consumers that have stopped committing or stopped making progress) and then rolls these individual statuses up into a single consumer group status, providing a holistic view of consumer health.
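The rules above can be sketched as a small evaluation function. This is a rough, illustrative restatement of Burrow's sliding-window logic, not its exact implementation; the function name and the window shape (a list of committed-offset/lag samples) are assumptions made for the example.

```python
def evaluate_partition(window):
    """Classify a partition from a window of (committed_offset, lag) samples.

    Simplified restatement of Burrow's threshold-free rules:
      - lag reached zero anywhere in the window        -> OK
      - offsets not advancing while lag persists       -> STALL
      - offsets advancing but lag rising at every step -> WARN
      - otherwise (lag fluctuating)                    -> OK
    """
    offsets = [o for o, _ in window]
    lags = [l for _, l in window]
    if min(lags) == 0:
        return "OK"            # consumer caught up at some point in the window
    if offsets[-1] == offsets[0]:
        return "STALL"         # commits are not advancing while lag remains
    if all(b > a for a, b in zip(lags, lags[1:])):
        return "WARN"          # consumer is moving, but lag grows every sample
    return "OK"

print(evaluate_partition([(10, 5), (12, 6), (14, 7)]))  # -> WARN
```

Note how no fixed lag threshold appears anywhere: only the shape of the lag over the window matters, which is what lets Burrow avoid false alarms on bursty topics.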

Key Features of Burrow

Burrow offers a robust set of features that make it particularly valuable for Kafka monitoring:

  1. No thresholds: consumer groups are evaluated over a sliding window rather than against fixed lag numbers

  2. Support for monitoring multiple Kafka clusters from a single deployment

  3. Automatic monitoring of all consumers that commit offsets to Kafka, with configurable support for Zookeeper-committed offsets

  4. An HTTP endpoint for requesting consumer group status and broker/consumer information

  5. Configurable notifiers for sending alerts via email or to an external HTTP endpoint

Burrow Architecture and Components

Burrow employs a modular design that separates responsibilities into distinct subsystems, each handling a specific aspect of monitoring:

Clusters Subsystem

The Clusters subsystem runs Kafka clients that periodically update topic lists and track the current HEAD offset (most recent offset) for every partition. This provides the baseline for measuring consumer lag.
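The lag baseline this subsystem provides is simply the distance between the broker's HEAD offset and a consumer's last committed offset for a partition. A minimal illustration (the function name is ours, not Burrow's):

```python
def partition_lag(head_offset, committed_offset):
    """Consumer lag for one partition: how far the committed offset
    trails the broker's most recent (HEAD) offset."""
    return max(0, head_offset - committed_offset)

print(partition_lag(1500, 1420))  # -> 80
```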

Consumers Subsystem

The Consumers subsystem fetches information about consumer groups from repositories like Kafka clusters (consuming the __consumer_offsets topic) or Zookeeper. This data includes details about which consumers are active and what offsets they've committed.

Storage Subsystem

The Storage subsystem maintains all the information gathered by the Clusters and Consumers subsystems. It provides this data to other subsystems when requested for evaluation and notification purposes.

Evaluator Subsystem

The Evaluator subsystem retrieves information from Storage for specific consumer groups and calculates their status following consumer lag evaluation rules. This is where Burrow's threshold-free approach is implemented.

Notifier Subsystem

The Notifier subsystem periodically requests status information on consumer groups and sends notifications (via Email, HTTP, or other methods) for groups meeting configured criteria. This enables proactive monitoring and alerts.

HTTP Server Subsystem

The HTTP Server subsystem provides an API interface for retrieving information about clusters and consumers, making Burrow's data accessible to external systems and dashboards.

Setting Up and Configuring Burrow

Setting up Burrow involves several steps, with multiple deployment options available depending on your environment and requirements.

Using Docker Compose

The simplest way to get started with Burrow is using Docker Compose:

```bash
docker-compose up --build -d
```

This command uses the Docker Compose file to install Apache ZooKeeper, Kafka, and Burrow, with some test topics created by default.

Building from Source with Go

As Burrow is written in Go, you can also build it from source:

  1. Install Go on your system

  2. Download Burrow from the GitHub repository

  3. Configure Burrow using a YAML file

  4. Build and run the binary:

```bash
export GO111MODULE=on
go mod tidy
go install
# Run the installed binary, pointing at your configuration directory
$GOPATH/bin/Burrow --config-dir /path/containing/config
```

Basic Configuration

Burrow uses the viper configuration framework for Golang applications. A minimal configuration file might look like this:

```yaml
zookeeper:
  servers:
    - "localhost:2181"
  timeout: 3
kafka:
  brokers:
    - "localhost:9092"
burrow:
  logdir: /var/log/burrow
  storage:
    local:
      path: /var/lib/burrow
  client-id: burrow-client
  cluster-name: local
  consumer-groups:
    - "burrow-test-consumer-group"
httpserver:
  address: "localhost:8000"
```

This configuration connects to a local Zookeeper and Kafka instance, specifies storage locations, and sets up a basic HTTP server.

Advanced Configuration Options

For production deployments, you might want to configure additional features:

  1. Security settings for SASL and SSL

  2. Multiple Kafka clusters for centralized monitoring

  3. Notification configurations for alerts via email or HTTP

  4. Customized HTTP API settings for integration with other systems

Interacting with Burrow

Once Burrow is running, you can interact with it in several ways to monitor your Kafka infrastructure.

Using the REST API

Burrow exposes several REST API endpoints that provide information about Kafka clusters, consumer groups, and their status. Key v3 endpoints include:

  1. GET /v3/kafka — list the monitored Kafka clusters

  2. GET /v3/kafka/(cluster)/consumer — list the consumer groups in a cluster

  3. GET /v3/kafka/(cluster)/topic — list the topics in a cluster

  4. GET /v3/kafka/(cluster)/consumer/(group)/status — the evaluated status of a consumer group

  5. GET /v3/kafka/(cluster)/consumer/(group)/lag — full lag details for every partition a group consumes

The most important endpoint for monitoring is the consumer group status endpoint, which returns detailed information about the status of a consumer group, including which partitions are lagging.
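A minimal sketch of working with a status response follows. The payload here is illustrative: the field names follow the shape of Burrow's v3 status response, but the cluster name, group name, topic, and lag values are placeholders, and in a real deployment you would fetch the JSON from http://localhost:8000/v3/kafka/(cluster)/consumer/(group)/status instead of embedding it.

```python
import json

# Illustrative v3 status payload (placeholder cluster/group/topic names).
sample_response = json.loads("""
{
  "error": false,
  "status": {
    "cluster": "local",
    "group": "burrow-test-consumer-group",
    "status": "WARN",
    "totallag": 120,
    "partitions": [
      {"topic": "orders", "partition": 0, "status": "OK",   "current_lag": 0},
      {"topic": "orders", "partition": 1, "status": "WARN", "current_lag": 120}
    ]
  }
}
""")

def lagging_partitions(response):
    """Return (topic, partition, lag) for every partition not in OK state."""
    status = response["status"]
    return [(p["topic"], p["partition"], p["current_lag"])
            for p in status["partitions"] if p["status"] != "OK"]

print(lagging_partitions(sample_response))  # -> [('orders', 1, 120)]
```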

Using Dashboard UIs

While Burrow itself doesn't include a dashboard, several open-source projects provide front-end interfaces:

  1. BurrowUI : A simple dashboard that can be installed via Docker:

```bash
docker pull generalmills/burrowui
docker run -p 80:3000 -e BURROW_HOME="http://localhost:8000/v3/kafka" -d generalmills/burrowui
```

  2. Burrow Dashboard : Another option that can be deployed with:

```bash
docker pull joway/burrow-dashboard
docker run --network host -e BURROW_BACKEND=http://localhost:8000 -d -p 80:80 joway/burrow-dashboard:latest
```

Integration with Monitoring Tools

Burrow can be integrated with popular monitoring tools to enhance visibility and alerting capabilities.

Prometheus and Grafana

Using Burrow Exporter, you can export Burrow metrics to Prometheus and visualize them in Grafana dashboards:

  1. Install Burrow Exporter

  2. Configure it to scrape metrics from Burrow

  3. Set up Prometheus to collect metrics from the exporter

  4. Create Grafana dashboards to visualize the metrics
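Conceptually, step 2 boils down to translating Burrow's JSON status into Prometheus exposition-format gauges. The sketch below shows that translation only; the metric name and label set are illustrative, not necessarily the ones the real Burrow Exporter emits.

```python
def to_prometheus(cluster, group, partitions):
    """Render per-partition lag as Prometheus exposition-format lines.

    Metric name and labels are illustrative; the real Burrow Exporter
    defines its own metric names.
    """
    lines = ["# TYPE kafka_burrow_partition_lag gauge"]
    for p in partitions:
        lines.append(
            'kafka_burrow_partition_lag{cluster="%s",group="%s",topic="%s",partition="%d"} %d'
            % (cluster, group, p["topic"], p["partition"], p["lag"])
        )
    return "\n".join(lines)

print(to_prometheus("local", "my-group",
                    [{"topic": "orders", "partition": 0, "lag": 42}]))
```

Prometheus then scrapes this text output on a schedule, and Grafana queries the resulting time series to draw lag dashboards and drive alerts.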

InfluxDB and Telegraf

The Burrow Telegraf Plugin allows integration with InfluxDB:

```toml
[[inputs.burrow]]
  servers = ["http://localhost:8000"]
  # api_prefix = "/v3/kafka"
  # response_timeout = "5s"
  # concurrent_connections = 20
  # clusters_include = []
  # clusters_exclude = []
  # groups_include = []
  # groups_exclude = []
  # topics_include = []
  # topics_exclude = []
```

This configuration enables Telegraf to collect Burrow metrics and write them to InfluxDB for visualization and alerting.

Best Practices for Using Burrow

To maximize the effectiveness of Burrow in monitoring your Kafka infrastructure, consider the following best practices:

Deployment Considerations

  1. High Availability : Deploy Burrow with redundancy to avoid monitoring gaps.

  2. Resource Allocation : Ensure Burrow has sufficient resources to monitor all your Kafka clusters.

  3. Security Configuration : Properly secure Burrow's API endpoints, especially in production environments.

  4. Regular Updates : Keep Burrow updated to support newer Kafka versions and fix security vulnerabilities.

Monitoring Strategy

  1. Monitor All Consumer Groups : Let Burrow automatically discover and monitor all consumer groups.

  2. Focus on Critical Groups : Identify and prioritize consumer groups that are critical to your business.

  3. Set Appropriate Window Sizes : Configure evaluation windows based on your message processing patterns.

  4. Implement Notification Filters : Avoid alert fatigue by filtering notifications based on consumer group priority.

Integration with Operations

  1. Centralize Monitoring : Integrate Burrow with your existing monitoring systems for a unified view.

  2. Automate Responses : Where possible, automate responses to common consumer lag issues.

  3. Document Recovery Procedures : Create clear documentation for addressing different types of consumer lag problems.

  4. Regular Testing : Periodically test your monitoring and alerting setup to ensure it works as expected.

Common Issues and Troubleshooting

Despite its robust design, Burrow users may encounter certain issues that require troubleshooting.

Kafka Version Compatibility

Burrow may stop emitting metrics after Kafka upgrades, as seen in the issue where it stopped working after upgrading from Kafka 3.6.x to 3.7.x. If you experience this:

  1. Check that Burrow's kafka-version configuration matches your actual Kafka version

  2. Review Burrow logs for errors like "failed to fetch offsets from broker" or "error in OffsetResponse"

  3. Restart Burrow to reestablish connections to the Kafka cluster

Consumer Lag Calculation Issues

If Burrow is unable to calculate consumer lag for some topics:

  1. Verify that consumers are committing offsets correctly

  2. Check if the low water mark is available for the partitions

  3. Ensure that Burrow has the necessary permissions to access the __consumer_offsets topic

Performance Considerations

For large Kafka deployments with many topics and consumer groups:

  1. Limit concurrent connections using the concurrent_connections configuration

  2. Filter which clusters, groups, or topics to monitor using include/exclude patterns

  3. Adjust the response timeout to avoid timeouts during heavy load periods

Alternative Monitoring Tools

While Burrow is powerful, other monitoring solutions might better suit specific needs. More Kafka tools can be found here: Top 12 Free Kafka GUI Tools 2025

Conclusion

Burrow represents a significant advancement in Kafka monitoring by eliminating arbitrary thresholds and providing a more nuanced view of consumer health. Its modular architecture, robust API, and integration capabilities make it a valuable tool for organizations relying on Kafka for their streaming data needs.

By following the setup procedures, configuration best practices, and integration strategies outlined in this guide, you can leverage Burrow to gain deep insights into your Kafka infrastructure, proactively address consumer lag issues, and ensure the reliability of your streaming data platform.

Whether you're running a small Kafka deployment or managing a large-scale streaming infrastructure, Burrow's objective monitoring approach and flexible configuration options make it an excellent choice for keeping your Kafka ecosystem healthy and performant.

If you find this content helpful, you might also be interested in our product AutoMQ. AutoMQ is a cloud-native alternative to Kafka that decouples durability to S3 and EBS. 10x cost-effective. No cross-AZ traffic cost. Autoscales in seconds. Single-digit ms latency. AutoMQ's source code is now available on GitHub. Big companies worldwide are using AutoMQ. Check our case studies to learn more.
