Search code examples
postgresqlconcurrencylockingrdbmsmvcc

MVCC snapshot limit for concurrent queries


I am trying to learn PostgreSQL MVCC architecture. It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?

For example if there are 1000 concurrent queries and table size is huge. This will create multiple instances of the table.

Is my understanding correct?


Solution

  • It says that MVCC creates a separate snapshot for each concurrent query. Isn't this approach memory inefficient?

    You could argue it is memory inefficient. It usually isn't a big problem in practise.

    For example if there are 1000 concurrent queries and table size is huge.

    Why would you have/want 1000 concurrent queries? Do you have 1000 CPUs? If there is a risk that you will try to establish 1000 concurrent queries, then you should deploy some entry control mechanism (like a connection pooler) that prevents this from happening, with a fallback to max_connections.

    This will create multiple instances of the table.

    A snapshot is not a copy of the table. Is just a set of information that gets applied to the base table rows dynamically to decide which rows are visible in that snapshot. The size of a snapshot is proportional to number of concurrent transactions (one reason not have 1000 of them), not to the size of the table.