Search code examples
databasepostgresqlmvcc

Why PostgreSQL indexes do not contain visibility information?


I know that the physical storage in PostgreSQL looks like:

heap table:
<old_tuple, t_xmin, t_xmax>
<new_tuple, t_xmin, t_xmax>
index:
<old_index_value, old_RID>
<new_index_value, new_RID>

So Index-Only Scan need the help of Visibility Map.

My question is: Why not we store the t_xmin, t_xmax in index as well?

like:

index:
<old_index_value, old_RID, t_xmin, t_xmax>
<new_index_value, new_RID, t_xmin, t_xmax>

Solution

  • There is relatively high overhead - t_xmin, t_xmax has 8 bytes together, and probably will have 16 bytes in future. So if Postgres stores these values to index, then almost all numeric indexes will be 2 (bigint) times or 2/3 times (int) bigger.

    Today it's not an problem (probably), but Postgres beginnings are in half of 80 years, and there disc capacity was big problem.

    Second motivation is, probably, complexity of code and ensuring data consistency (without hard locking). Indexes in Postgres was data access accelerators, not source of data. Then the implementation is more simpler. Ingres was designed by very smart professors and students and more robust less complex (but possible slower) design was preferred.