I have an InfluxDB server instance containing several databases, like sensors
, network
, telegraf
and so on.
Together these databases consume several dozens of GB, and I want to offload only the sensors
database to another more powerful machine.
The simplest case would be that I create a new InfluxDB server instance on that other machine, and just move (rsync) the influxdb/data/sensors
folder to the other machine, and delete it from the original one.
While I haven't tested it, I assume that this does not work that easily; there is a data/_internal
directory, then there's the meta/meta.db
file as well as the wal/*
directory, which will probably require everything to be left "as-is" in order for the server instance to boot without error.
Since I'm talking about dozens of GBs per database, I'd ideally just would like to mount a new ssd, copy the files/directories, and then mount that new ssd on the other machine and use it directly as the new data source without further copying.
I'd basically wish I could do this in a way as easy as moving rrd-tool's rrd
files from one machine to another.
Is this possible? If not, what are my options?
Edit 2022: This is a solution which works for InfluxDB 1.x, the commands shown here may not be directly applicable to 2.x. Here is a link to the 2.x backup/restore documentation: https://docs.influxdata.com/influxdb/v2.2/backup-restore/
The InfluxDB 2.2 influx backup command is not compatible with versions of InfluxDB prior to 2.0.0.
I resorted to using influxd backup
/ influxd restore
as Yuri Lachin pointed out.
While it does have the drawback of first needing to save the data on disc and then read it in from there, it seems to be the the most flexible approach.
Rsyncing 50GB does take a certain amount, and the databases would need to be offline during that time, which is not a requirement for backup / restore
; so no data is lost. It also allows to migrate the data which used to be on one single InfluxDB instance to different InfluxDB servers without having to think about the issue with the metadata database.
The backup / restore
can be done in steps, where the first step ist to initially backup all the data of the database, restore it into the new server instance, and then exporting the newest data again which didn't make it into the first backup, restoring it again into the new database.
Step 1:
On the machine containing the new, empty InfluxDB server instance, backup the data from the remote, old InfluxDB instance:
influxd backup \
-portable \
-host 192.168.11.10:8088 \
-database sensors \
/var/lib/influxdb/export-sensors-01
Afterwards import this data into the new server instance:
influxd restore \
-portable \
/var/lib/influxdb/export-sensors-01
Step 2:
Now take the time to adjust the IP-address or domain name to which the InfluxDB clients are currently connected, and make them point to the new InfluxDB server; restart the clients if necessary.
Step 3:
During the time the backup
finished and you restarted the clients with the new IP-address, new data was still written to the old database, so we will need to sync that data over.
Again, on the new server, pull a backup from the old one, but specify the time range of the missing data and a different target directory:
influxd backup \
-portable \
-host 192.168.11.10:8088 \
-database sensors \
-start 2019-06-22T19:30:00Z \
-end 2019-06-24T00:00:00Z \
/var/lib/influxdb/export-sensors-02
Apparently it is important to specify -end
as well, one test I did which had no -end
argument started to backup the entire database again. I just ctrl-d'd out of it and deleted /var/lib/influxdb/export-sensors-02
and started it again with the -end
argument set.
The -start
argument can contain a couple of minutes of the data which already got restored, since during restoring this second backup these duplicated entries will be ignored or overwrite the already existing identical values.
For example, if you start the main backup at 4pm and it finishes at 6pm, the second backup can contain a -start
argument of 5:55pm and an -end
argument a couple of days in the future, which is no problem, because as soon as you switch the IP-addresses of the client, no more future data will be written to the old database. Probably the -since
argument would have been better, but I was experimenting a bit with time ranges so I left it at using -start
+-end
.
In order to insert the missing data which you just backed up into /var/lib/influxdb/export-sensors-02
you need to do a bit more work, since you can't restore
into an already existing database. If you try to do it, nothing is damaged, only a warning message is shown and restore
gets aborted.
So we will need to restore the data into a new, temporary database:
influxd restore \
-portable \
-database sensors \
-newdb sensors_tmp_backup \
/var/lib/influxdb/export-sensors-02
Then copy the data into the sensors
database:
influx \
-database=sensors_tmp_backup \
-execute 'SELECT * INTO sensors..:MEASUREMENT FROM /.*/ GROUP BY *'
And delete the temporary database:
influx \
-database=sensors_tmp_backup \
-execute 'DROP DATABASE sensors_tmp_backup'
If all is OK, delete the backup directories
rm -rf /var/lib/influxdb/export-sensors-01
rm -rf /var/lib/influxdb/export-sensors-02
Before changing the addresses with Step 2, you can test Step 3 a couple of times, by making the new db catch up the old, current one via a couple of backups. It's a good way to get acquainted with the procedure in Step 3.
If you're running InfluxDB in Docker, like I am doing, you can execute all the commands from the host. Step 3 would then look like this:
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influxd backup -portable -host 192.168.11.10:8088 -database sensors -start 2019-06-22T19:40:00Z -end 2019-06-24T00:00:00Z /var/lib/influxdb/export-sensors-02
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influxd restore -portable -database sensors -newdb sensors_tmp_back /var/lib/influxdb/export-sensors-02
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influx -database=sensors_tmp_back -execute 'SELECT * INTO sensors..:MEASUREMENT FROM /.*/ GROUP BY *'
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 influx -database=sensors_tmp_back -execute 'DROP DATABASE sensors_tmp_back'
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 rm -rf /var/lib/influxdb/export-sensors-01
docker exec -w /var/lib/influxdb/ influxdb-1.7.6 rm -rf /var/lib/influxdb/export-sensors-02
If you are having problems accessing the remote InfluxDB server keep in mind that the RPC-port 8088
is usually bound to localhost
for security reasons, so you may need to bind it to 0.0.0.0
first, probably by setting the environment variable INFLUXDB_BIND_ADDRESS
on the remote instance to 0.0.0.0:8088
, as specified in the documentation, and then restarting the server.