I am interested in the implementation details of sequences in PostgreSQL, the reason being that they're used in the background for the SERIAL
type. I have reviewed the source file here (found in Google search: "postgresql source code sequence"), although it is not clear to which version of PG this file relates to. I understand the general code flow in the function nextval_internal
, which is the heart of the issue, but PG's code base is large and I just can't afford the time to research this issue to its fullest.
I'm interested to know:
Sequences are like normal tables in many respects: they are found in the pg_class
catalog and they have a data file, which is where the current value is stored.
In recent versions of PostgreSQL (as a side effect of making ALTER SEQUENCE
fully transactional) you also have to SELECT
from the sequence to see its current value.
Persistence is handled transactionally like in a normal table (even though nextval
is not transactional), so changes go through the transaction log, but different from normal tables there is no multiversioning. Rather, the single row of a sequence is changed “in place”. That's ok because we don't need different versions for the value, and VACUUM
would never be able to keep up with cleaning up in a busy sequence.
To answer your questions individually:
During recovery, changes to the sequence are replayed from WAL just like any other data change.
Using a sequence has very little overhead. Modifications will not cause I/O all the time, because sequences are cached in shared buffers like all other tables. Only during checkpoints they have to be persisted to storage, and there is some small amount of WAL to be accounted for.
If you want to further reduce the overhead of using sequences, define them with a CACHE
greater than one.
There are no problems with using sequences. They are the best way to generate unique numbers.