Learn how to monitor Redpanda using Datadog's built-in Redpanda integration.
Datadog is infrastructure-monitoring software with alerting, metric visualization, anomaly detection, log management, and continuous integration (CI) capabilities. Datadog has 500+ built-in integrations that cover the most frequently used applications and services, including cloud providers, databases, and messaging systems. Redpanda is one of these integrations.
This tutorial shows you how to use Datadog's Redpanda integration to monitor key metrics and metric groups, with step-by-step instructions you can follow along with. You can access all code used in the demo in this GitHub repo.
Datadog as a monitoring tool
Datadog’s web user interface for developers has a side navigation menu with many capabilities, including:
- Dashboards for a compact overview of the application and service metrics.
- Integration options for an easier way to set up your application and service integrations.
- Monitors to define alerts and notifications based on metrics, availability, or log patterns.
- An Events Explorer to display the events generated by integrated infrastructure and services.
To combine metrics and logs from a variety of applications and services, Datadog uses Datadog Agent, a type of software that runs on your hosts and collects events and metrics from applications and services, and forwards them to Datadog.
Datadog Agent can be installed on a host or in a containerized environment.
This tutorial provides step-by-step instructions on how to set up the Redpanda integration on Datadog Agent, using a docker-compose setup to simplify the instructions. You'll also see examples of frequently used metrics and their visualizations.
Prerequisites
You’ll need the following for this tutorial:
- A Datadog account
- Your Datadog API key (`DD_API_KEY`)
- The name of your Datadog site (`DD_SITE`)
- A machine with Docker and docker-compose installed

To get the `DD_API_KEY`, which is required for submitting metrics and events to Datadog, navigate to Organization Settings under your user tab, where you can find your organization's API keys and create a new one by clicking +New Key.
You can follow the instructions on Datadog’s API and Application Keys to learn about Datadog API management.
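Once you have a key, a quick way to confirm it works is to submit a test metric to Datadog's v1 series endpoint. The sketch below only builds the request body; the endpoint path follows Datadog's public metrics API, and the metric name `redpanda.tutorial.test` is made up for illustration:

```python
import time

def build_series_payload(metric: str, value: float, host: str) -> dict:
    """Build the JSON body Datadog's v1 `POST /api/v1/series` endpoint expects."""
    return {
        "series": [
            {
                "metric": metric,
                "points": [[int(time.time()), value]],  # one (timestamp, value) sample
                "type": "gauge",
                "host": host,
            }
        ]
    }

# Usage (requires the `requests` package and your real DD_API_KEY / DD_SITE):
# import os, requests
# resp = requests.post(
#     f"https://api.{os.environ['DD_SITE']}/api/v1/series",
#     headers={"DD-API-KEY": os.environ["DD_API_KEY"]},
#     json=build_series_payload("redpanda.tutorial.test", 1.0, "demo-host"),
# )
```

A 202 response from Datadog indicates the key was accepted.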
Monitoring Redpanda with Datadog
A real-time event streaming setup typically has three main services: consumers, producers, and the event streaming platform (middleware). With a `docker-compose.yml` file, it's possible to simulate this multi-component setup.
The producer streams event data to Redpanda, a Dockerized Datadog Agent monitors Redpanda through the Agent integration, and the consumer consumes the produced streaming events. To see how the components relate to each other, it's helpful to go over each one. The following sections explain each of the components.
Start with Docker Compose
You can find the docker-compose setup in the demo repository. After cloning the repository, set up your `.env` file at the same level as your `docker-compose.yml`.
The `.env` file in this tutorial includes two variables: `TOKEN` for reading data from an API and `DATADOG_API_KEY` for your Datadog API key:

```
TOKEN=<API_TOKEN>
DATADOG_API_KEY=<DD_API_KEY>
```
The `TOKEN` is used by the producer service as it reads real-time data from the Coinranking API. Coinranking is a cryptocurrency information web application whose API provides real-time information about cryptocurrencies. You can create a Coinranking account to obtain a `TOKEN`. If you wish to use a different type of data set, you can also create your own producer and consumer using another real-time data API.
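As a sketch of how the producer's API call might look (the endpoint and the `x-access-token` header follow Coinranking's public v2 API; treat both as assumptions to verify against their docs):

```python
# Assumed Coinranking v2 endpoint for listing coins
COINRANKING_COINS_URL = "https://api.coinranking.com/v2/coins"

def build_coinranking_request(token: str) -> tuple[str, dict]:
    """Return the URL and auth headers for a Coinranking coins request."""
    return COINRANKING_COINS_URL, {"x-access-token": token}

# Usage (requires the `requests` package and a real TOKEN):
# import os, requests
# url, headers = build_coinranking_request(os.environ["TOKEN"])
# coins = requests.get(url, headers=headers).json()["data"]["coins"]
```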
After setting up your `.env` file, you can run the following command to start your multi-component application in detached mode:

```shell
docker-compose up -d
```
Set up Redpanda
Setting up Redpanda in a Docker container first requires pulling the image. Then, you need to start the Redpanda service with the configuration parameters explained below.
You don't need to change the parameters. However, it's useful to know why the docker-compose Redpanda service uses each parameter in the command:
- `--overprovisioned` limits the resources. This is for Docker usage.
- `--node-id` defines the unique ID for a node in a Redpanda cluster.
- `--check` enables/disables the checks performed at startup.
- `--kafka-addr` defines the internal Apache Kafka® API address and port.
- `--advertise-kafka-addr` defines the external Kafka API address and port. The consumer is expected to use the external address.
- `--pandaproxy-addr` defines the internal Redpanda REST API address and port.
- `--advertise-pandaproxy-addr` defines the external Redpanda REST API address and port.
- `redpanda.enable_transactions` enables transactions.
- `redpanda.enable_idempotence` enables the idempotent producer. These two optional configurations must be enabled explicitly.
- `redpanda.auto_create_topics_enabled` enables automatic topic creation.
To see more options, check out the Redpanda custom configuration documentation.
Since this tutorial provides you a Redpanda instance in a Docker container, having Datadog’s log collection labels might be useful for log management. You can retrieve these with the following code:
```yaml
---
version: "3.7"
services:
  redpanda:
    image: docker.redpanda.com/vectorized/redpanda:v22.1.4
    container_name: redpanda
    networks:
      - monitoring
    command:
      - redpanda start
      - --overprovisioned
      - --smp 1
      - --memory 1G
      - --reserve-memory 0M
      - --node-id 0
      - --check=false
      - --kafka-addr 0.0.0.0:9092
      - --advertise-kafka-addr redpanda:9092
      - --pandaproxy-addr 0.0.0.0:8082
      - --advertise-pandaproxy-addr redpanda:8082
      - --set redpanda.enable_transactions=true
      - --set redpanda.enable_idempotence=true
      - --set redpanda.auto_create_topics_enabled=true
    labels:
      com.datadoghq.ad.logs: '[{"source": "redpanda", "service": "redpanda_cluster"}]'
      com.datadoghq.ad.check_names: '["redpanda"]'
    ports:
      - 9092:9092
      - 8081:8081
      - 8082:8082
      - 9644:9644
```
In the above code, labels are defined for log collection configuration from the container:

- `com.datadoghq.ad.logs: '[{"source": "redpanda", "service": "redpanda_cluster"}]'` enables Datadog Agent to see the Redpanda container logs with the specified source and service. This can make it easier to create a custom log manager for the Redpanda service.
- `com.datadoghq.ad.check_names: '["redpanda"]'` enables the auto-discovery feature for the container named `redpanda`.
Note: Running Redpanda directly via Docker is not supported for production usage; use this setup only for testing. Additionally, the Redpanda integration currently does not provide a default log management pipeline. However, you can create your own custom log parsing pipeline for Redpanda clusters; see the Datadog documentation for more on creating custom log parsing pipelines.
Set up Datadog
The Datadog Agent is installed in a Docker container:
```dockerfile
FROM gcr.io/datadoghq/agent:7
COPY tools/datadog/datadog.yaml /etc/datadog-agent/datadog.yaml
COPY tools/redpanda/configuration.yaml /etc/datadog-agent/conf.d/redpanda.d/conf.yaml
LABEL "com.datadoghq.ad.logs"='[{"source": "agent", "service": "datadog_agent"}]'
RUN agent integration install -r -t datadog-redpanda==1.1.0
```
In the above code, there are a few important steps:

- `datadog.yaml` is copied into the Datadog Agent configuration path (`/etc/datadog-agent/datadog.yaml`).
- `configuration.yaml` is copied into the Redpanda integration path (`/etc/datadog-agent/conf.d/redpanda.d/conf.yaml`).
- The container label is defined as `"com.datadoghq.ad.logs"='[{"source": "agent", "service": "datadog_agent"}]'`.
- Lastly, the `datadog-redpanda` integration is installed with an Agent command.
The `datadog.yaml` file should look like the following:

```yaml
logs_enabled: true
site: datadoghq.eu
```
As you can see, logs are enabled and the Datadog site is defined as the EU location.
The Redpanda configuration `configuration.yaml` for Datadog is as follows:

```yaml
instances:
  - openmetrics_endpoint: http://redpanda:9644/metrics
    use_openmetrics: true
logs:
  - type: journald
    source: redpanda
```
Note: This tutorial is using an older version of Redpanda. Check the Monitoring Docs for the most up-to-date endpoint.
In the above, the `instances` tag maps the node metrics endpoint to Datadog as an `openmetrics` endpoint. The `logs` tag enables collecting logs from the source named `redpanda`.
For more optional parameters available in `configuration.yaml`, you can see the `configuration.example.yml` in this GitHub repo.
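To get a feel for what the agent scrapes from `http://redpanda:9644/metrics`, here's a minimal sketch that parses one Prometheus/OpenMetrics-format sample line. The metric name and label in the example are illustrative, not an exact Redpanda metric:

```python
import re

# name, optional {label="value",...} block, then the sample value
SAMPLE_RE = re.compile(r'^(?P<name>[\w:]+)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')

def parse_sample(line: str):
    """Split a Prometheus exposition line into (metric name, label dict, float value)."""
    m = SAMPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = {}
    if m.group("labels"):
        for pair in m.group("labels").split(","):
            key, _, val = pair.partition("=")
            labels[key] = val.strip('"')
    return m.group("name"), labels, float(m.group("value"))

# Example line in the shape the endpoint exposes (illustrative metric name):
name, labels, value = parse_sample('vectorized_storage_log_written_bytes{topic="crypto"} 1024')
```

Datadog's OpenMetrics check does this parsing for you; the sketch is only to show the wire format the `openmetrics_endpoint` setting points at.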
The code above explains the three main steps to create a Dockerized Datadog Agent, which can be summarized as follows:

- Defining the `datadog.yaml` and copying it into the Datadog Agent configuration path.
- Defining the `configuration.yaml` and copying it into the Redpanda integration path of Datadog Agent.
- Installing the `datadog-redpanda` package in the Agent.
Using the Datadog Agent as a service
By following this tutorial, you’ve created a simplified simulation of a multi-component application, part of which is the monitoring component (Datadog).
Using this same setup, you can build the monitoring component of your docker-compose with a Dockerfile named `Dockerfile.datadog`, which you can use to add more configuration for networking, environment variables, and volume mounting.
The following is an example docker-compose service for the Datadog Agent:

```yaml
---
version: "3.7"
services:
  datadog_agent:
    build:
      context: .
      dockerfile: Dockerfile.datadog
    container_name: datadog_agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
    environment:
      - DD_API_KEY=${DATADOG_API_KEY}
      - DD_SITE=datadoghq.eu
      - DD_LOGS_ENABLED=true
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
      - DD_CONTAINER_EXCLUDE_LOGS=name:datadog_agent
    depends_on:
      redpanda: { condition: service_healthy }
    ports:
      - "8125:8125/udp"
    networks:
      - monitoring
```
In the above code, you can find a series of environment variables that are specific to Datadog, which can be useful for explicitly configuring your Docker Agent. The environment variables and their explanations are as follows:
- `DD_API_KEY` sets your account key.
- `DD_SITE` is the Datadog site, setting your Datadog location.
- `DD_LOGS_ENABLED` enables/disables your logs (default is false).
- `DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL` enables/disables log collection for all containers (default is false).
- `DD_CONTAINER_EXCLUDE_LOGS` excludes specific containers' logs from collection; in this tutorial's YAML configuration, the agent container itself is named `datadog_agent` and is excluded from log collection.
Producers and consumers
The sections above explain the setup for integration between Redpanda as the messaging middleware and Datadog Agent as a monitoring tool. In addition to those two components, in an event streaming environment you would also expect to have producers and consumers of streaming data.
Producers insert messages into the messaging middleware, Redpanda. Consumers are decoupled from the producers and read data from the middleware for further processing, such as inserting it into a relational or non-relational database.
Using a Kafka client, you can refer to the broker addresses to programmatically produce and consume messages; clients connect to the external Kafka API addresses of the broker's nodes. In this tutorial, the cluster has a single node whose external Kafka API address is set to `redpanda:9092`.
As this tutorial focuses on how to integrate Redpanda with Datadog and the resulting monitoring benefits, no details about the clients for Redpanda have been provided. However, you can see the Python producer and consumer written for this setup in the project repository.
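As an illustration of what the producer side involves, the sketch below serializes a record into a Kafka message value. The topic name `crypto-prices` and the record shape are assumptions for this sketch (not taken from the repo), and the commented wiring uses the kafka-python client:

```python
import json

def encode_coin_event(coin: dict) -> bytes:
    """Serialize one coin record into the bytes used as a Kafka/Redpanda message value."""
    return json.dumps({"symbol": coin["symbol"], "price": coin["price"]}).encode("utf-8")

# Hypothetical wiring with kafka-python (requires a reachable broker at redpanda:9092):
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="redpanda:9092")
# producer.send("crypto-prices", value=encode_coin_event({"symbol": "BTC", "price": "27000"}))
# producer.flush()
```

A consumer would do the reverse: subscribe to the topic, then `json.loads` each message value.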
Redpanda–Datadog integration
The previous steps explained how to simulate an event streaming setup locally with docker-compose
, demonstrating how to integrate Redpanda and Datadog.
Once you have the Docker Datadog Agent, Redpanda, and a custom consumer and producer working in your setup, go to your Datadog user interface to find the Integrations tab on the sidebar.
Then, you can install or configure your Redpanda integration from the Redpanda integration box.
The Redpanda integration provides a Redpanda dashboard, where you'll find a compact overview of the frequently used metrics and data collected by Datadog. You can find a full list here, but some examples include:

- Latency metrics:
  - `redpanda.kafka.latency_produce_latency_us.sum`: Producer latency in microseconds.
  - `redpanda.kafka.latency_fetch_latency_us.sum`: Consumer latency in microseconds.
- Throughput metrics:
  - `redpanda.kafka.rpc_received_bytes.count`: Bytes produced (received by the broker).
  - `redpanda.kafka.rpc_sent_bytes.count`: Bytes consumed (sent by the broker).
- Topic-related metrics:
  - `redpanda.storage.log_written_bytes.count`: Written bytes per topic.
  - `redpanda.storage.log_read_bytes.count`: Read bytes per topic.
  - `redpanda.cluster.partition_records_produced.count`: Number of records written per topic.
  - `redpanda.cluster.partition_records_fetched.count`: Number of records read per topic.
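Since the throughput counters are reported in bytes, charting them as megabytes per second on a dashboard takes a rate and a scale factor. A sketch of such a Datadog graph query (the `{*}` scope and exact formula are assumptions to adapt to your tags):

```
per_second(sum:redpanda.kafka.rpc_received_bytes.count{*}) / 1048576
```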
The Redpanda integration also comes with a check service for monitoring the health of your Redpanda cluster.
Conclusion
In this article, you learned how to integrate your Datadog Agent with your Redpanda cluster, and you also saw some frequently used metrics that come with the integration.
With the Redpanda–Datadog integration, you can determine critical metrics for monitoring your infrastructure, which you can use to define alerts and notifications, detect anomalies, or see if the Redpanda service is healthy.
As a reminder, you can access the resources for this tutorial in the GitHub repository here. Check out Redpanda's source-available code in this repo, or join the Redpanda Community on Slack to ask questions about running this demo.