
Page inspection using heap_page_items() after pruning the page


I was reading the Postgres Internals book, and the Page Pruning section shows an example of how Postgres cleans up a page once it has accumulated enough dead tuples. I tried the example and got the expected result.

For context, the example was very similar to this:

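-- heap_page_items(), get_raw_page() and bt_page_items() come from pageinspect
CREATE EXTENSION IF NOT EXISTS pageinspect;

CREATE TABLE messages (
    id INTEGER,
    content CHAR(2000)
);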

CREATE INDEX messages_id ON messages(id);
CREATE INDEX messages_content ON messages(content);

INSERT INTO messages (id,content) VALUES (1, 'A');

UPDATE messages SET content = 'B' WHERE id = 1;
UPDATE messages SET content = 'C' WHERE id = 1;
UPDATE messages SET content = 'D' WHERE id = 1;
UPDATE messages SET content = 'E' WHERE id = 1;

SELECT lp, t_ctid, t_data, CASE
       WHEN t_xmax = 0 THEN 'live'
       ELSE 'dead'
       END AS status
FROM heap_page_items(get_raw_page('messages', 0));

SELECT * FROM bt_page_items('messages_id', 1);

Output:

[Image: table inspection result for the first example]

[Image: index inspection result for the first example]

The book says that Postgres pruned the page because it had filled up and the dead row versions needed to be cleaned out. It also says that the remaining heap tuples are physically moved towards the highest addresses so that the free space is aggregated, and that the index page stays the same because pruning only operates on a single page at a time. All of that is fine.

My problem started when I modified the example as follows:

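-- Start fresh: the first example already created a table with this name
DROP TABLE IF EXISTS messages;

CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    content CHAR(2000)
);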

INSERT INTO messages (content) VALUES ('A');

UPDATE messages SET content = 'B' WHERE id = 1;
UPDATE messages SET content = 'C' WHERE id = 1;
UPDATE messages SET content = 'D' WHERE id = 1;
UPDATE messages SET content = 'E' WHERE id = 1;

SELECT lp, t_ctid, t_data, CASE
       WHEN t_xmax = 0 THEN 'live'
       ELSE 'dead'
       END AS status
FROM heap_page_items(get_raw_page('messages', 0));

SELECT * FROM bt_page_items('messages_pkey', 1);

Output:

[Image: table inspection result for the second example]

[Image: index inspection result for the second example]

I changed id to an auto-incrementing primary key, which also creates a primary key index by default. As you can see, the result is very different and does not match anything the book describes: the ctid and htid in the index differ from those in the table, the free space is not aggregated, and the index does not have any dead tuples. I have no idea why this happened and I'm looking for some clarification.


Solution

  • The difference is the index on content in the first example.

    Because of that index, the updates cannot be HOT updates. Each new row version gets an index entry, which cannot be cleaned up until VACUUM runs. That explains the five index tuples you can see.

    The five line pointers to which the index entries point must be retained, but the last update goes ahead and reclaims the space of the first three table tuples as it finds that the page is over 90% full.


    In the second example, where you have no index on content, all updates are HOT updates, so they create no new index entry. The new “heap-only tuple” created by such an update is only referenced from the ctid of the previous tuple in the HOT chain.

    Since there is no external reference to the line pointers of such a heap-only tuple, each update can “prune” any dead tuples and the line pointers of dead heap-only tuples from the previous updates. The situation you see in your second example is that the last update reused line pointer 2 for the new tuple (line pointer 1 cannot be pruned and reused, since it is referenced from the index).
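
If you want to check this on your own system, the following is a minimal sketch rather than a definitive recipe. It assumes the pageinspect extension is installed and that the messages table from either example exists. The first query decodes the line pointer state and the HOT-related bits of t_infomask2 (the bit values are taken from the PostgreSQL heap tuple header definitions), and the second reads the cumulative update counters. In the first example you would expect dead line pointers and no HOT updates. In the second you would expect heap-only tuples, and line pointer 1 would most likely show up as a redirect to the live part of the chain rather than as a normal tuple.

```sql
-- Assumption: pageinspect is available and "messages" exists as in the examples.
CREATE EXTENSION IF NOT EXISTS pageinspect;

-- Decode the line pointer state and the HOT flags for every item on page 0.
SELECT lp,
       lp_off,   -- for a redirect, this holds the target line pointer number
       lp_len,
       CASE lp_flags
            WHEN 0 THEN 'unused'
            WHEN 1 THEN 'normal'
            WHEN 2 THEN 'redirect'
            WHEN 3 THEN 'dead'
       END AS lp_state,
       t_ctid,
       -- 0x8000 = HEAP_ONLY_TUPLE, 0x4000 = HEAP_HOT_UPDATED (NULL if no tuple)
       (t_infomask2 & 32768) <> 0 AS heap_only,
       (t_infomask2 & 16384) <> 0 AS hot_updated
FROM heap_page_items(get_raw_page('messages', 0));

-- Compare total updates with HOT updates for the table.
-- These counters are reported asynchronously, so they may lag briefly.
SELECT n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'messages';
```

Comparing n_tup_hot_upd with n_tup_upd is a quick way to tell the two cases apart without reading the page contents at all.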