We have a Postgres `tenants` table and multiple foreign-key tables with a `tenant_id` column. We would like to archive all of a tenant's data to an S3 bucket and delete it from the Postgres tables whenever that tenant is set to inactive (`is_active = false`), in order to save on storage costs.
I'm wondering what the best way to do this is. I looked at AWS Database Migration Service (DMS), but it seems like that would replicate all of the data, not just the inactive data, and I would still need to delete only the inactive rows afterwards rather than the entire table. I also considered identifying all the tables that hold tenant data and running a script that saves the data to an S3 bucket, then deletes it from the source tables once it has been successfully replicated to S3.
Are there any other methods to accomplish this easily? Quite a few tables are connected to a tenant, and I want to make sure there isn't an easier way to migrate the inactive data than writing a long script.
You have two choices:
You can Export data from an RDS for PostgreSQL DB instance to Amazon S3 - Amazon Relational Database Service by using the `aws_s3` extension. It takes an SQL statement (e.g. `select * from table where tenant_id = 12`) and a bucket and key for where to store the data. The data is saved in CSV format (but options can modify things like delimiters).
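As a minimal sketch, a single-table export could look like the following. The bucket name, object key, region, and `orders` table are assumptions for illustration, and the RDS instance must already have an IAM role attached that allows `s3:PutObject` on the target bucket:

```sql
-- One-time setup (requires sufficient privileges, e.g. rds_superuser):
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;

-- Export one tenant's rows from one table to S3 as CSV.
-- Bucket, key, region, and table name below are hypothetical.
SELECT *
FROM aws_s3.query_export_to_s3(
    'SELECT * FROM orders WHERE tenant_id = 12',   -- rows to archive
    aws_commons.create_s3_uri(
        'my-tenant-archive',        -- S3 bucket (assumed)
        'tenant-12/orders.csv',     -- object key (assumed)
        'us-east-1'                 -- bucket region (assumed)
    ),
    options := 'format csv, header true'           -- COPY-style options
);
```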
So, just run the export command for each table, selecting the rows you want to keep a copy of in S3.
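Once the exports have succeeded, the deletes can be a plain SQL script; you can also enumerate every table that carries a `tenant_id` column from `information_schema` instead of listing them by hand. A sketch, with hypothetical table names, assuming child tables reference `tenants` via `tenant_id` foreign keys:

```sql
-- Find every table with a tenant_id column, so the export and delete
-- steps don't have to be maintained by hand.
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'tenant_id'
  AND table_schema NOT IN ('pg_catalog', 'information_schema');

-- Then remove the archived tenant's rows. Child tables go first so
-- foreign keys stay satisfied (or use ON DELETE CASCADE on the FKs).
-- Table names here are hypothetical.
BEGIN;
DELETE FROM order_items WHERE tenant_id = 12;
DELETE FROM orders      WHERE tenant_id = 12;
DELETE FROM tenants     WHERE id = 12;
COMMIT;
```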