Tags: apache-kafka, graphite, druid

Using druid graphite emitter extension


I'm trying out the graphite emitter plugin in druid to collect certain druid metrics in graphite during druid performance tests. The intent is to then query these metrics using the REST API provided by graphite in order to characterize the performance of the deployment.

However, the numbers returned by graphite don't make sense. So, I wanted to check if I'm interpreting the results in the right manner.

Setup

  • The kafka indexing service is used to ingest data from kafka into druid.
  • I've enabled the graphite emitter and provided a whitelist of metrics to collect.
  • Then I pushed 5000 events to the kafka topic being indexed. Using kafka-related tools, I confirmed that the messages are indeed stored in the kafka logs.
  • Next, I retrieved the ingest.rows.output metric from graphite using the following call:

curl "http://<Graphite_IP>:<Graphite_Port>/render/?target=druid.test.ingest.rows.output&format=csv"

  • Following are the results I got:
druid.test.ingest.rows.output,2017-02-22 01:11:00,0.0 
druid.test.ingest.rows.output,2017-02-22 01:12:00,152.4 
druid.test.ingest.rows.output,2017-02-22 01:13:00,97.0 
druid.test.ingest.rows.output,2017-02-22 01:14:00,0.0
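
For reference, here is a minimal sketch of how the CSV rows above can be parsed and totalled. The helper name parse_render_csv is made up for illustration, and treating an empty value field as "no data" is an assumption about the CSV output rather than something I verified.

```python
import csv
import io
from datetime import datetime

# Sample response copied from the render call above.
RESPONSE = """\
druid.test.ingest.rows.output,2017-02-22 01:11:00,0.0
druid.test.ingest.rows.output,2017-02-22 01:12:00,152.4
druid.test.ingest.rows.output,2017-02-22 01:13:00,97.0
druid.test.ingest.rows.output,2017-02-22 01:14:00,0.0
"""


def parse_render_csv(text):
    """Parse CSV render output into (metric, timestamp, value) tuples.

    Hypothetical helper, not part of Graphite or Druid.
    """
    points = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        metric, stamp, raw = (field.strip() for field in row)
        if raw == "":
            continue  # assumption: an empty value means no datapoint
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        points.append((metric, when, float(raw)))
    return points


points = parse_render_csv(RESPONSE)
total = sum(value for _, _, value in points)
print(f"{len(points)} datapoints, total = {total:.1f}")
```

Even when the four datapoints are added up, the total is nowhere near the 5000 messages that were pushed.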

I don't know how to interpret these numbers.

Questions

  1. What do the numbers 152.4 and 97.0 in the output indicate?
  2. How can the 'number of rows' be a floating point value like 152.4?
  3. How do these numbers relate to the '5000' messages I pushed to Kafka?

Thanks in advance,

Jithin


Solution

  • I figured out the issue after some experimentation. Since my kafka topic has multiple partitions, druid runs multiple tasks to index the kafka data (one task per partition). Each of these tasks reports various metrics at regular intervals. For each metric, the number that graphite returns for a given time interval is the average of the values reported by all the tasks for that metric in that interval, which is also why a row count can come back as a fractional value. In my case above, had the aggregation function been sum (instead of average), the values obtained from graphite would have added up to 5000.

    However, I wasn't able to figure out whether the averaging is done by the graphite-emitter druid plugin or by graphite.
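
    The arithmetic can be sketched as follows. The per-task values and task names below are hypothetical, chosen only so that the average works out to 152.4; they are not the numbers from my test run, and the aggregate helper is made up for illustration.

```python
# Hypothetical per-task values for one reporting interval. These are NOT
# measurements from the actual run; they only illustrate the arithmetic.
reported_by_task = {
    "task_partition_0": 150,
    "task_partition_1": 160,
    "task_partition_2": 148,
    "task_partition_3": 152,
    "task_partition_4": 152,
}


def aggregate(values, how):
    """Combine the values reported by all tasks for one interval."""
    values = list(values)
    if how == "sum":
        return sum(values)
    if how == "average":
        return sum(values) / len(values)
    raise ValueError(f"unknown aggregation: {how}")


avg = aggregate(reported_by_task.values(), "average")
total = aggregate(reported_by_task.values(), "sum")

# Whole row counts per task still average to a fractional number.
print(f"average = {avg:.1f}, sum = {total}")
```

    With averaging, the stored value depends on how many tasks reported in the interval, so it cannot be compared directly with the number of messages pushed; only the summed values can.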