Tags: database, postgresql, time-series, backup, timescaledb

How to drop the oldest entries from a remote TimescaleDB while maintaining a full local backup of the database?


I've got a remote, ever-growing TimescaleDB database. I would like to keep only the most recent entries in that database and back up the rest of the data to a local drive, so that the database size on the server stays constant.

I thought of making full pg_dump backups before each retention run and rebuilding the database locally from these backups.

Alternatively, I could use WAL-E to maintain a continuous copy, somehow ignoring the deletions made on the remote database.

What would be the most efficient way to achieve that?


Solution

  • (TimescaleDB person here)

    There are two main approaches here:

    • Use a backup system like WAL-E or pgBackRest to continuously replicate data to some other source (like S3).
    • Integrate your use of TimescaleDB's drop_chunks with your data extraction process.
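
    For the first option, WAL-E hooks into PostgreSQL's continuous-archiving machinery. A minimal sketch of the server-side settings, assuming WAL-E is installed and its S3 credentials live under /etc/wal-e.d/env (the path and the timeout value are illustrative), might look like:

        # postgresql.conf on the remote server -- illustrative values
        wal_level = replica        # "archive" on older PostgreSQL versions
        archive_mode = on
        archive_timeout = 60       # switch WAL segments at least every 60 seconds
        archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'

    Keep in mind that a WAL archive replays everything, including the deletions you run on the remote database, so on its own it gives you point-in-time recovery rather than a deletion-free local copy.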

    The answer somewhat depends on how complex your data/database is.

    If the data you want to archive lives primarily in a single hypertable, I would recommend the latter: use show_chunks to determine which chunks fall entirely before a certain point in time, run a SELECT over that range and write the data wherever you like, and then execute drop_chunks over the same range.
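
    A minimal sketch of that flow in Python with psycopg2, assuming a hypothetical hypertable named "conditions", a 90-day retention window, and TimescaleDB 2.x function signatures (the table name, connection details, and retention interval are all illustrative, not part of the original setup):

        from datetime import datetime, timedelta, timezone

        import psycopg2

        HYPERTABLE = "conditions"  # hypothetical hypertable name
        CUTOFF = datetime.now(timezone.utc) - timedelta(days=90)  # retention boundary

        conn = psycopg2.connect("host=remote.example.com dbname=metrics user=backup")
        with conn, conn.cursor() as cur:
            # 1. Find the chunks that lie entirely before the cutoff.
            cur.execute(
                "SELECT show_chunks(%s::regclass, older_than => %s)",
                (HYPERTABLE, CUTOFF),
            )
            old_chunks = [row[0] for row in cur.fetchall()]

            # 2. Export each of those chunks (they are ordinary tables) to local CSV files.
            for chunk in old_chunks:
                filename = chunk.replace(".", "_") + ".csv"
                with open(filename, "w") as f:
                    cur.copy_expert(
                        f"COPY (SELECT * FROM {chunk}) TO STDOUT WITH CSV HEADER", f
                    )

            # 3. Only after the export succeeded, drop the same range on the server.
            cur.execute(
                "SELECT drop_chunks(%s::regclass, older_than => %s)",
                (HYPERTABLE, CUTOFF),
            )
        conn.close()

    Exporting the chunks returned by show_chunks directly, rather than re-filtering the hypertable by time, keeps the archived rows aligned with what the subsequent drop_chunks call removes, since both use the same cutoff.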