I'm trying out the graphite-emitter plugin in Druid to collect certain Druid metrics in Graphite during Druid performance tests. The intent is to then query these metrics using Graphite's REST API in order to characterize the performance of the deployment.
However, the numbers returned by Graphite don't make sense, so I wanted to check whether I'm interpreting the results correctly.
Setup
I queried the ingest.rows.output metric from Graphite using the following call:
curl "http://<Graphite_IP>:<Graphite_Port>/render/?target=druid.test.ingest.rows.output&format=csv"
druid.test.ingest.rows.output,2017-02-22 01:11:00,0.0
druid.test.ingest.rows.output,2017-02-22 01:12:00,152.4
druid.test.ingest.rows.output,2017-02-22 01:13:00,97.0
druid.test.ingest.rows.output,2017-02-22 01:14:00,0.0
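For reference, here is a small sketch of how I read the render-API CSV output and total the reported rows (the sample data is just the four datapoints above; empty value fields, which Graphite emits for missing datapoints, are treated as None):

```python
import csv
import io

# Sample output of /render/?...&format=csv: one "series,timestamp,value" row per datapoint.
csv_text = """\
druid.test.ingest.rows.output,2017-02-22 01:11:00,0.0
druid.test.ingest.rows.output,2017-02-22 01:12:00,152.4
druid.test.ingest.rows.output,2017-02-22 01:13:00,97.0
druid.test.ingest.rows.output,2017-02-22 01:14:00,0.0
"""

def parse_render_csv(text):
    """Parse (series, timestamp, value) rows; an empty value field becomes None."""
    points = []
    for series, ts, value in csv.reader(io.StringIO(text)):
        points.append((series, ts, float(value) if value else None))
    return points

points = parse_render_csv(csv_text)
# Total rows reported over the whole window, skipping missing datapoints.
total = sum(v for _, _, v in points if v is not None)
print(total)  # -> 249.4
```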
I'm not sure how these numbers should be interpreted:
Questions
Thanks in advance,
Jithin
I figured out the issue after some experimentation. Since my Kafka topic has multiple partitions, Druid runs multiple tasks to index the Kafka data (one task per partition). Each of these tasks reports various metrics at regular intervals. For each metric, the number obtained from Graphite for each time interval is the average of the values reported by all the tasks for that metric in that interval. In my case above, had the aggregation function been sum (instead of average), the value obtained from Graphite would have been 5000.
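To illustrate the average-vs-sum difference with hypothetical numbers (the per-task values below are made up; the real ones depend on the partition count and ingestion rate):

```python
# Hypothetical per-task values for one reporting interval, one value per
# indexing task (i.e. one per Kafka partition in my setup).
task_values = [1200.0, 1400.0, 1100.0, 1300.0]

average = sum(task_values) / len(task_values)  # what an "average" aggregation reports
total = sum(task_values)                       # what a "sum" aggregation would report

print(average, total)  # -> 1250.0 5000.0

# Recovering the sum from the average requires knowing the task count:
assert average * len(task_values) == total
```

So with an average aggregation, the rendered value understates the true ingest rate by a factor equal to the number of tasks.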
However, I wasn't able to figure out whether the averaging is done by Druid's graphite-emitter plugin or by Graphite itself.
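One thing worth checking (this is speculative; I haven't confirmed that the emitter writes per-task paths at all): if each task emitted to its own sub-path, Graphite's sumSeries() function could add the per-task series at render time instead of averaging them. The path pattern below is hypothetical:

```shell
# Hypothetical: assumes each indexing task emits to its own sub-path,
# e.g. druid.test.<taskId>.ingest.rows.output. Replace host/port with
# your Graphite endpoint.
curl "http://<Graphite_IP>:<Graphite_Port>/render/?target=sumSeries(druid.test.*.ingest.rows.output)&format=csv&from=-10min"
```

If all tasks instead write to the same metric path, the aggregation happens before render time, and the answer lies in the emitter or in carbon's configuration rather than in the query.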