Tags: graphite, collectd, graphite-carbon

Graphite only keeps a week of data from a custom collectd Exec plugin


I have only a single retention rule configured in my storage-schemas.conf:

[default_rentions]
pattern = .*
#retentions = 60s:7d,120s:31d,600s:365d,30m:9y
retentions = 15s:7d,5m:30d,15m:10y

All other collectd data is saved as expected, but any custom metric gathered via the Exec plugin is kept for only a week.

collectd.conf config:

LoadPlugin exec

<Plugin exec>
    Exec "centos:centos" "/etc/collectd/site-benchmarks.pl"
</Plugin>

I've already tried manually resizing the existing whisper files to the retentions I want, but that didn't fix the issue.

find ./ -type f -name '*.wsp' -exec whisper-resize.py --nobackup {} 15s:7d 5m:30d 15m:10y \;

I've also tried removing the corresponding *.wsp files so that Graphite rebuilds them from scratch, and that didn't help either. All newly collected custom Exec data is still only being kept for a week.

Does anyone have any idea why only the data from the custom collectd Exec plugin is kept for just a week?

Update: I've even verified the updated retention configs have been applied to the whisper files. Example test wsp file whisper metadata output:

[centos@ip-172-16-16-124 apache-response-time]$ whisper-dump.py gauge-test.wsp
Meta data:
  aggregation method: average
  max retention: 315360000
  xFilesFactor: 0.5

Archive 0 info:
  offset: 52
  seconds per point: 15
  points: 40320
  retention: 604800
  size: 483840

Archive 1 info:
  offset: 483892
  seconds per point: 300
  points: 8640
  retention: 2592000
  size: 103680

Archive 2 info:
  offset: 587572
  seconds per point: 900
  points: 350400
  retention: 315360000
  size: 4204800
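
The numbers in this dump can be cross-checked against the configured retentions. The following is a minimal sketch, not Whisper's own code: it assumes the usual Whisper file layout of a 16-byte metadata header, 12 bytes of archive info per archive, and 12 bytes per stored point, and the helper names (`to_seconds`, `layout`) are made up for illustration.

```python
# Sketch: recompute the archive layout that whisper-dump.py reports.
# Assumed sizes (hypothetical constants, based on the usual Whisper layout):
METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, number of points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value

UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 86400 * 365}


def to_seconds(spec):
    """Convert a spec such as '15s', '5m', '7d' or '10y' to seconds."""
    return int(spec[:-1]) * UNITS[spec[-1]]


def layout(retentions):
    """Return offset, step, points, retention and size for each archive."""
    archives = [
        tuple(to_seconds(part) for part in rule.split(":"))
        for rule in retentions.split(",")
    ]
    offset = METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives)
    result = []
    for step, retention in archives:
        points = retention // step
        size = points * POINT_SIZE
        result.append({
            "offset": offset,
            "seconds_per_point": step,
            "points": points,
            "retention": retention,
            "size": size,
        })
        offset += size  # the next archive starts right after this one
    return result


if __name__ == "__main__":
    for number, archive in enumerate(layout("15s:7d,5m:30d,15m:10y")):
        print("Archive", number, archive)
```

Under those assumptions the computed offsets, point counts, and sizes agree with the dump, which suggests the retention schema itself was applied correctly and the problem lies elsewhere.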

Solution

  • How long does the executed script take to run?

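    It is possible that the script submits points less often than once every 15 seconds, which leaves gaps in the highest-resolution archive. If too few points are present in an aggregation window, carbon (through Whisper) will not write an aggregated value to the lower-resolution archives, so nothing survives beyond the 7 days of the first archive. If that is the case, you could lower the xFilesFactor, which is the fraction of datapoints in the finer archive that must be present for them to be aggregated. Your files currently use 0.5, so at least half of the 15-second slots in each 5-minute window must contain a value.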
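
    To make the effect concrete, here is a small sketch of the xFilesFactor check. It is an illustration rather than Whisper's actual code, the function names are hypothetical, and the 60-second reporting interval is only an assumed example, since the question does not say how often the script emits a value.

```python
# Sketch of the xFilesFactor rule for rolling 15s points up into a 5m point.
# All names here are illustrative; the 60s reporting interval is an assumption.

def aggregate(points, x_files_factor=0.5):
    """Average the known points, or return None if too few are present."""
    known = [p for p in points if p is not None]
    if not points or len(known) / len(points) < x_files_factor:
        return None  # not enough data: nothing is written to the coarser archive
    return sum(known) / len(known)


def slot_values(report_interval, fine_step=15, coarse_step=300, value=1.0):
    """Build one coarse window of fine-resolution slots.

    A slot holds `value` if a report lands on it, otherwise None.
    """
    slots = coarse_step // fine_step
    return [
        value if (i * fine_step) % report_interval == 0 else None
        for i in range(slots)
    ]


if __name__ == "__main__":
    # One value per minute fills 5 of 20 slots (25%), below the 0.5 threshold.
    print(aggregate(slot_values(60)))
    # The same data passes once the threshold is lowered to 0.1.
    print(aggregate(slot_values(60), x_files_factor=0.1))
    # A value every 15 seconds fills every slot and always aggregates.
    print(aggregate(slot_values(15)))
```

    If the script really does report less often than every 30 seconds, either lower the xFilesFactor on the affected files or make the first archive's resolution match the script's actual reporting interval.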