
Differences of using the rrdtool RRD PDP or RRA consolidation function to calculate the average reading?


I'm updating an rrdtool round-robin database (RRD) at 1-minute intervals. I want to store the average value of five updates as one RRA entry. One way to do this is:

$ rrdtool create foo.rrd --start 1000000500 --step 60 \
> DS:ping:GAUGE:120:0:1000 RRA:AVERAGE:0.5:5:12; \
> rrdtool update foo.rrd 1000000560:10 1000000620:20 \
> 1000000680:30 1000000740:40 1000000800:50

This accumulates five readings and stores their average as a single entry in the RRA. However, one could achieve the same with this:

$ rrdtool create bar.rrd --start 1000000500 --step 300 \
> DS:ping:GAUGE:600:0:1000 RRA:AVERAGE:0.5:1:12; \
> rrdtool update bar.rrd 1000000560:10 1000000620:20 \
> 1000000680:30 1000000740:40 1000000800:50

As seen above, the step is 300 seconds, but since the RRD PDP accepts values between the intervals and calculates the average, both examples store 30 ((10+20+30+40+50)/5) in the RRA. One difference I can tell is that the first example requires at least three updates to store an entry in the RRA, while in the second example a single update within the 300-second step is enough. Are there any other differences?
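
To double-check, fetching the consolidated rows from both files should report roughly the same value (about 30) for the interval ending at 1000000800, although the exact output formatting may vary between rrdtool versions:

$ rrdtool fetch foo.rrd AVERAGE --start 1000000500 --end 1000000800
$ rrdtool fetch bar.rrd AVERAGE --start 1000000500 --end 1000000800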


Solution

  • These two examples are not really the same thing under the covers, though they can appear the same in some circumstances.

    In the first, you have a 60s step, and your RRA stores the average of 5 PDPs in each CDP.

    In the second, you have a 300s step, and your RRA stores each PDP as a CDP.
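
    You can see this structural difference directly by comparing the two files with rrdtool info; look in particular at the step, ds[ping].minimal_heartbeat and rra[0].pdp_per_row fields (the exact field list depends on your rrdtool version):

    $ rrdtool info foo.rrd
    $ rrdtool info bar.rrd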

    Here are some differences:

    • In the first, you will need at least one sample (PDP) every 2 minutes (the 120s heartbeat), so at least three samples to cover each 300s CDP in the RRA. In the second, a single sample per CDP is enough.
    • In the first, data normalisation of each sample happens over a 60s window. In the second, it happens over a 300s window. This will make things look different when the samples arrive irregularly.
    • In the first, you can have up to 120s with no data before you get an Unknown; in the second, up to 600s (see the sketch just after this list).
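
    As a sketch of how that last point plays out (the timestamps and the value 60 below are purely illustrative): if the next update arrives a full 300s after the last one at 1000000800, foo.rrd exceeds its 120s heartbeat, so the whole gap becomes Unknown and the CDP ending at 1000001100 should come back as NaN; bar.rrd, with its 600s heartbeat, should still record a known value for that row:

    $ rrdtool update foo.rrd 1000001100:60; \
    > rrdtool update bar.rrd 1000001100:60; \
    > rrdtool fetch foo.rrd AVERAGE --start 1000000800 --end 1000001100; \
    > rrdtool fetch bar.rrd AVERAGE --start 1000000800 --end 1000001100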

    While the RRA outcome is pretty much the same in both, which you choose depends on the nature of your incoming data (how often samples arrive and how irregular they are) and on your storage and display requirements (whether you need higher granularity stored or displayed). The first option is more accurate if you have frequent samples; the second needs less storage and less work, but may sacrifice some detail if your updates arrive more often than the step.

    Note that, if you have RRA consolidation functions other than just AVERAGE (for example MIN, MAX or LAST), a smaller step will make those calculations more accurate.
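
    As a rough illustration (the foo-max.rrd and bar-max.rrd files below are hypothetical, fed with the same five updates as in the question): with a 60s step, a MAX RRA consolidating 5 PDPs should record 50, whereas with a 300s step the five readings are already averaged into a single PDP before any consolidation, so a MAX RRA can only record about 30:

    $ rrdtool create foo-max.rrd --start 1000000500 --step 60 \
    > DS:ping:GAUGE:120:0:1000 RRA:MAX:0.5:5:12; \
    > rrdtool update foo-max.rrd 1000000560:10 1000000620:20 \
    > 1000000680:30 1000000740:40 1000000800:50
    $ rrdtool create bar-max.rrd --start 1000000500 --step 300 \
    > DS:ping:GAUGE:600:0:1000 RRA:MAX:0.5:1:12; \
    > rrdtool update bar-max.rrd 1000000560:10 1000000620:20 \
    > 1000000680:30 1000000740:40 1000000800:50
    $ rrdtool fetch foo-max.rrd MAX --start 1000000500 --end 1000000800
    $ rrdtool fetch bar-max.rrd MAX --start 1000000500 --end 1000000800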

    In general, I would recommend that you set the step to be close to the expected average sample frequency, with the heartbeat set according to how regular the data are. Set your RRA consolidation depending on how you need to view and display the data, and how long you are required to hold history for. Create RRAs corresponding to your normal display rollups, in order to minimise on-the-fly calculations.
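
    As a sketch of that advice (the file name, heartbeat and RRA sizes below are illustrative assumptions, not values from the question): for data arriving roughly once a minute you might use a 60s step, a heartbeat of twice the step to tolerate the occasional late sample, and AVERAGE RRAs matched to your usual display rollups, e.g. per-minute data for a day, 5-minute averages for a week and hourly averages for a year:

    $ rrdtool create ping.rrd --step 60 \
    > DS:ping:GAUGE:120:0:1000 \
    > RRA:AVERAGE:0.5:1:1440 \
    > RRA:AVERAGE:0.5:5:2016 \
    > RRA:AVERAGE:0.5:60:8760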