Updates to RRD fail after new datasource added


A little background: I have a single RRD that holds aggregated values from 1500+ individual RRDs (I am monitoring 1500+ devices). I do this so that I do not have to hit 1500+ RRDs when I want values from every monitored device that holds the data I am looking for. The group of monitored devices is constantly growing, so I do some XML editing (much like the contrib Perl script that adds new datasources to an already existing RRD) to account for my new devices. The update to the RRD happens once an hour.

The RRD was created with this:

--step 3600 
--start now 
DS:[$cabinet-totalw]:GAUGE:7200:U:U
RRA:AVERAGE:0.5:1:4392
RRA:AVERAGE:0.5:24:366
RRA:AVERAGE:0.5:744:36
RRA:MIN:0.5:24:732
RRA:MAX:0.5:24:732

FYI - $cabinet-totalw is in fact a variable in a for loop. The initial build looped through something like 1300 cabinets. I didn't want to list everything here.
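For anyone who wants to see what that loop amounts to, here is a rough sketch in Python (not my actual build script). It only assembles the argument list for rrdtool create using the step, heartbeat and RRA lines shown above. The function name build_create_args, the file name and the cabinet names are made up for illustration.

```python
def build_create_args(rrd_path, cabinets, heartbeat=7200):
    """Assemble the argv for `rrdtool create` with one GAUGE DS per cabinet."""
    args = ["rrdtool", "create", rrd_path, "--step", "3600", "--start", "now"]
    # One datasource per cabinet, named "<cabinet>-totalw" as in the question.
    for cabinet in cabinets:
        args.append(f"DS:{cabinet}-totalw:GAUGE:{heartbeat}:U:U")
    # Same RRA definitions as listed above.
    args += [
        "RRA:AVERAGE:0.5:1:4392",
        "RRA:AVERAGE:0.5:24:366",
        "RRA:AVERAGE:0.5:744:36",
        "RRA:MIN:0.5:24:732",
        "RRA:MAX:0.5:24:732",
    ]
    return args


if __name__ == "__main__":
    # Hypothetical cabinet names; the real build looped over ~1300 of them.
    argv = build_create_args("aggregate.rrd", ["cab0001", "cab0002"])
    print(" ".join(argv))
    # To actually create the file one could pass argv to
    # subprocess.run(argv, check=True), assuming rrdtool is on the PATH.
```

Building the list first and running it separately makes it easy to eyeball the DS definitions before the file is created.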

The issue: As a new device is added to the monitored group, its datasource is added to the aggregation RRD file. However, when the update fires, it doesn't actually update the RRD for some unknown reason. When I run the update manually, updatev exits with a zero. If I look at the xport output, I have NaN for any new datasource data. All existing datasources, however, seem to update without issue.

At the moment I'm lost as to why this is happening. Things seem to be in order, yet the update to the new RRD datasources does not take. Even more interesting, I've added datasources to this file in the past and those updated without issue. It just seems that the recent updates do not take.

I should also add that lastupdate does in fact show the last update correctly, so I assume it's a lack of RRD knowledge on my part?

ADDITION: I wrote a script that grabs the index of the DS I am interested in, then parses the output of rrdtool fetch to find the value at that index for each time interval. I found that the values are in fact being updated and stored in the RRD. Interestingly enough, the timestamp is showing 7 minutes after the allotted time slot (step is 3600, so data should be stored on the hour). I now believe this to be an interpolation issue?
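The index lookup described above could look roughly like this in Python. This is a sketch, not the script I used. It assumes the usual rrdtool fetch text layout: a header line of DS names, a blank line, then one "timestamp: value value ..." line per row. The function name parse_fetch and the sample numbers are invented for illustration.

```python
def parse_fetch(output, ds_name):
    """Return [(timestamp, value), ...] for one DS from `rrdtool fetch` text."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    # The first non-blank line lists the DS names in column order.
    names = lines[0].split()
    idx = names.index(ds_name)  # raises ValueError if the DS is missing
    rows = []
    for ln in lines[1:]:
        ts, _, rest = ln.partition(":")
        # float() accepts "nan" and "-nan", which is how unknowns are printed.
        rows.append((int(ts), float(rest.split()[idx])))
    return rows


# Made-up sample output with two hypothetical datasources.
SAMPLE_FETCH = """\
                    cab1-totalw         cab2-totalw

1699999200: 1.2000000000e+03 nan
1700002800: 1.2500000000e+03 4.0000000000e+02
"""

if __name__ == "__main__":
    for ts, value in parse_fetch(SAMPLE_FETCH, "cab2-totalw"):
        print(ts, value)
```

Looking up the column by name rather than hard-coding the index matters here, since the column order shifts whenever a datasource is added.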


Solution

  • I found my issue. When updating the RRD data in the XML file (after it has been dumped), I was mistakenly adding the wrong default values for the ds value and the min/max values. I needed to change the ds value nodes from NaN to 0.0000000000e+00 and the min/max values from 0.0000000000e+00 to NaN.

    Thanks to anyone who was trying to help.
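A minimal sketch of that fix in Python, in case it helps someone else. It assumes the dumped XML has top-level ds elements with name, min, max and value children, which is my understanding of the rrdtool dump layout. The function name fix_new_ds_defaults and the trimmed sample XML are made up, and I have not run the result through rrdtool restore. As far as I know, rrdtool stores any reading outside the min/max range as unknown, so a min and max of 0 would be consistent with the NaN seen in xport.

```python
import xml.etree.ElementTree as ET

ZERO = "0.0000000000e+00"


def fix_new_ds_defaults(xml_text, ds_name):
    """Set min/max to NaN and value to zero for one DS definition."""
    # Note: ElementTree drops XML comments and the DOCTYPE when parsing.
    root = ET.fromstring(xml_text)
    found = False
    for ds in root.findall("ds"):  # only the DS definitions, not cdp_prep
        if ds.findtext("name", default="").strip() != ds_name:
            continue
        for tag, default in (("min", "NaN"), ("max", "NaN"), ("value", ZERO)):
            node = ds.find(tag)
            if node is not None:
                node.text = default
        found = True
    if not found:
        raise KeyError(ds_name)
    return ET.tostring(root, encoding="unicode")


# Trimmed, made-up dump: cab2-totalw carries the wrong defaults.
SAMPLE_XML = """<rrd>
  <step>3600</step>
  <ds>
    <name> cab1-totalw </name>
    <min>NaN</min>
    <max>NaN</max>
    <value>0.0000000000e+00</value>
  </ds>
  <ds>
    <name> cab2-totalw </name>
    <min>0.0000000000e+00</min>
    <max>0.0000000000e+00</max>
    <value>NaN</value>
  </ds>
</rrd>"""

if __name__ == "__main__":
    print(fix_new_ds_defaults(SAMPLE_XML, "cab2-totalw"))
```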