Search code examples
sqldatabasepostgresqlrelational-databaseamazon-aurora

Postgres insert suddenly hangs for a long time after about 100 million records


I am trying to insert 300,000,000 records into an AWS aurora postgres database table. The table has about 170 columns.

I am using a tool which uses multiple threads to insert in parallel.

After about 92,000,000 records, the insertions suddenly hangs (before hanging, tool inserts about 1500 records per second), and a single insertion (even when executed through psql) takes about 30 minutes to complete.

  • If I truncate the table and retry, or use a new aurora instance, the same happens exactly around the same record count.
  • Restarting the DB doesn't have an effect.
  • Increasing the DB instance size doesn't have an effect.
  • Removing all the indexes and constraints (except for the primary key) didn't have an effect.
  • Insertions into other tables work fine while the insertions into the original table hangs.
  • If I use a non aurora postgres RDS, the same issue occurs at around 85,000,000 records.

Following log entry is logged repeatedly from the time the issue occurred.

2023-08-21 13:36:49 UTC:10.142.211.20(36632):mit@postgres:[13997]:LOG:  still searching for an unused OID in relation "pg_toast_16720"
2023-08-21 13:36:49 UTC:10.142.211.20(36632):mit@postgres:[13997]:DETAIL:  OID candidates have been checked 1000000 times, but no unused OID has been found yet.

Following is a screenshot of RDS performance Insights dashboard enter image description here

What could be causing this sudden slowness of inserts?


Solution

  • This has caused by reaching the max limit of TOASTs for the table.

    I could insert the desired number of records by changing the storage to PLAIN of all varchar columns.

    ALTER TABLE <table> ALTER COLUMN <column> SET STORAGE PLAIN;