Tags: sql, postgresql, postgresql-performance, vacuum, postgresql-9.5

Slow performance with small table after extreme reduction of size


I have a table with approximately 10 million rows, with the id column as the primary key.

Then I delete all rows where id > 10. Only 10 rows remain in the table.

Now, when I run the query SELECT id FROM tablename, execution time is approximately 1.2 - 1.5 seconds.
But SELECT id FROM tablename WHERE id = x only takes 10 - 11 milliseconds.

Why is the first SELECT so slow for just 10 rows?
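
For reference, a minimal setup that reproduces this behaviour (a sketch with a single-column table; the real table presumably has more columns):

    CREATE TABLE tablename (id bigint PRIMARY KEY);

    -- roughly 10 million rows
    INSERT INTO tablename
    SELECT g FROM generate_series(1, 10000000) g;

    -- remove all but the first 10 rows
    DELETE FROM tablename WHERE id > 10;

    -- slow right after the DELETE
    SELECT id FROM tablename;

    -- fast: single index lookup
    SELECT id FROM tablename WHERE id = 5;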


Solution

The main reason is the MVCC model of Postgres: deleted rows are kept until the system can be sure that the deleting transaction will not be rolled back and that the dead rows are no longer visible to any concurrent transaction. Only then can dead rows be physically removed by VACUUM - or, more radically, by VACUUM FULL.
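
If you don't want to wait for autovacuum, a manual cleanup right after the big DELETE looks like this (a sketch; note that VACUUM FULL rewrites the whole table and blocks all concurrent access to it while it runs):

    -- mark dead rows as reusable and refresh planner statistics
    VACUUM ANALYZE tablename;

    -- or, more radically: rewrite the table at minimal size,
    -- holding an ACCESS EXCLUSIVE lock for the duration
    VACUUM FULL ANALYZE tablename;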

Your simple query SELECT id FROM tablename - if run immediately after the DELETE and before autovacuum can kick in - still has to visit close to 10 million row versions and check visibility for each of them, only to rule out all but 10.

Your second query SELECT id FROM tablename WHERE id = x can use the primary key index and only needs to read a single data page from the (formerly) big table. That kind of query is largely immune to the total size of the table either way.
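
To see both effects for yourself, compare the plans (a sketch; the exact plans and buffer counts depend on your data, settings, and Postgres version):

    EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM tablename;
    EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM tablename WHERE id = 5;

Right after the DELETE, the first plan typically shows a scan touching a large number of buffers despite returning only 10 rows, while the second reads just a handful of pages via the primary key index.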

There may be a (much) more efficient way to delete almost all 10 million rows:
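
One such approach (a sketch; it assumes an exclusive lock on the table is acceptable and that no incoming foreign keys or similar dependencies prevent TRUNCATE) is to copy the few surviving rows aside and use TRUNCATE, which reclaims the space immediately instead of leaving millions of dead rows behind:

    BEGIN;

    -- keep the survivors aside; tmp_keep is a placeholder name
    CREATE TEMP TABLE tmp_keep AS
    SELECT * FROM tablename WHERE id <= 10;

    -- remove all rows and reclaim disk space at once
    TRUNCATE tablename;

    -- put the survivors back
    INSERT INTO tablename
    SELECT * FROM tmp_keep;

    COMMIT;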