I am running an Influx database instance which collects data from various sources. Sometimes, this data is simply incorrect and needs to be corrected, otherwise later analyses will fail and/or produce incorrect results. However, InfluxDB's SQL-like dialect does not have an UPDATE capability, and "updating" single data points is only possible by replacing datapoints with the same timestamps using another INSERT.
And for this purpose I am looking for a way to mass-edit selected datapoints inside a measurement, without interrupting the data collection process which may insert new datapoints while I'm editing. Here's what I tried:
SELECT * FROM measurement WHERE time > ...
using the HTTP API and I get a JSON file, but there is no (obvious) way to re-insert this JSON (after editing it) and replace the corrected measurements.influx_inspect
and reimport it, but this is faar too slow. This way, 200MB of raw data get blown up to >1GB of line protocol data which is way overkill to correct 2-3 datapoints.Any other ideas? I'm fine with scripting and editing stuff in a text editor, but I'm not fine with manually creating an INSERT for each datapoint that needs to be corrected.
Since there seems to be no real solution to this, I've hacked up a script to do this. Beware, it's really quick'n'dirty, untested on anything but my own hardware, and contains a lot of assumptions (eg. database on localhost, available Perl, etc).
https://gist.github.com/jensb/9efa234b80024a0e335de760d9a4f3aa