Search code examples
databasepostgresqlquery-plannermvcc

Relpages and reltuples under MVCC


If I understand correctly, under MVCC(multi version concurrency control), dead tuples are left in the page, until the Vacuum comes in and mark them "unused", and until "vacuum full" comes in and reorgnize them to defragment the space -- so we use less space for the same data.

I have a table, which in one environment that hasn't done vacuum full has around:

SELECT relpages, reltuples from pg_class where relname='pg_toast_16450';
 relpages  |  reltuples
-----------+--------------
 544447814 | 6.394397e+06

In another environment that has undergone vacuum full has:

SELECT relpages, reltuples from pg_class where relname='pg_toast_16450';
 relpages |  reltuples
----------+--------------
  2476625 | 4.439228e+06

Looks like relpages does lower dramatically, which matches my understanding. However, reltuples does not. (relpages has a 250X change, yet reltuples has just 1.33X) Does that mean reltuples does not include dead tuples? If that's the case, does the query planner utilizing reltuples to design query plan has a way around dead tuples?


Solution

  • reltuples is an estimate for the number of live rows in a table. As the documentation says,

    It is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.

    So the number an always be slightly off if the last such command (perhaps triggered by autovacuum) has run on the table, and VACUUM (FULL) would fix that.

    However, there is a second thing to consider, since this is a TOAST table: it may contain some entries that belong to dead rows in the table. The TOAST entries of a dead row need not be dead themselves, but VACUUM (FULL) would not copy them, so the number can shrink additionally. I would suspect that this has happened, because the number is reduced by more than the 10% that I would expect otherwise.