Tags: postgresql, docker, shared-memory, postgresql-15

Handling "Out of Shared Memory" Error in PostgreSQL (Container) with 80K Sub-Partitioned Tables


I have PostgreSQL 15.3 running as a Docker container. My docker run configuration is -m 512g --memory-swap 512g --shm-size=16g

Using this configuration, I loaded 36B rows, taking up about 30 TB between tables and indexes. The load ran with about 24 parallel operations, and the pg driver used connection pooling, so there were about 150 concurrent connections, though not all active. My postgresql.conf was set to

max_connections = 512 
max_locks_per_transaction = 1024

All of this ran without any problem.

The 36B-row table is sub-partitioned, and the layer that actually holds data has about 80,000 tables: the parent is partitioned by category, each category by year, each year by month, and each month by sensor id, which yields the 80K leaf tables of data.

My problem is that, for the life of me, I cannot perform a simple COUNT(*) on the top-level table without hitting "out of shared memory" with the hint "You might need to increase max_locks_per_transaction". In fact, I can count from a single leaf table and from a month partition, but not from a year partition, which spans only about 10K tables.
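For reference, something like the following should show how many locks even planning the query needs (the year-partition name is a placeholder for one of mine): run it in an open transaction so the locks stay held, then count this backend's entries in pg_locks.

BEGIN;
-- planning alone takes AccessShareLock on every child table and index it considers
EXPLAIN SELECT count(*) FROM sensor_data_2023;
SELECT count(*) AS locks_held FROM pg_locks WHERE pid = pg_backend_pid();
ROLLBACK;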

Reading that the lock table in shared memory needs to hold max_connections * max_locks_per_transaction entries, I've tried lowering max_connections and increasing both max_locks_per_transaction and shared memory, but without success. Currently I am at -m 512g --memory-swap 512g --shm-size=64g for the container and

max_connections = 8 
max_locks_per_transaction = 163840

in the config, having worked up to these values in increments, of course.
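As a sanity check, the theoretical capacity of the shared lock table can be computed from pg_settings with a quick bit of arithmetic (my own back-of-the-envelope query, nothing official):

SELECT (SELECT setting::int FROM pg_settings WHERE name = 'max_connections')
     * (SELECT setting::int FROM pg_settings WHERE name = 'max_locks_per_transaction')
       AS lock_table_slots;  -- 8 * 163840 = 1310720 with the settings above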

I do have a second table partitioned the same way, with a much smaller volume of data but also about 80K tables holding data. From what I've read, something like

max_connections = 100 
max_locks_per_transaction = 1600

should cover 160K tables, and if each lock entry takes 168 bytes, the lock table should need only about 26 MB of shared memory.
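Written out as a query (with the 168-byte figure taken from what I've read, not something I've verified):

SELECT 100 * 1600                                  AS lock_slots,
       pg_size_pretty((100 * 1600 * 168)::bigint)  AS approx_lock_table_size;
-- lock_slots = 160000, approx_lock_table_size comes out around 26 MB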

Any suggestions on what I should be modifying and how I should be calculating the target values?


Solution

  • Having a ridiculous number of partitions is going to require a ridiculous number of locks. The max_connections * max_locks_per_transaction you show should be enough to handle 80,000 tables in the simple case of unindexed tables with flat partitioning, but indexes also need to be locked during planning even if they don't end up being used in the execution, and a deep partitioning structure will require some more locks on top of that (a query sketch for counting the relations involved is at the end of this answer). Without a test script to reproduce your exact structure, I haven't tested to see how many it would actually need.

    Handling partitions gets more and more unwieldy the more partitions you have. That is going to put a practical limit on the number of partitions you can have long before you reach a hard limit. I would say you are already over the practical limit.

    These problems are just internal to PostgreSQL. If you were hitting memory problems at the OS/VM level, I would expect those to fail with different error messages than the one you see.

    The locking issues should be the same whether the tables are populated or merely exist empty, so if you need to test things, it should be easy to do just by exporting the structure without the data (like pg_dump -s).
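    If you want a number to size max_locks_per_transaction against, something along these lines (with a placeholder name for the top-level partitioned table) counts every relation in the partition tree plus the indexes on them; a COUNT(*) on the root has to be able to lock roughly that many objects in a single transaction:

    WITH RECURSIVE tree AS (
        SELECT 'sensor_data'::regclass::oid AS relid   -- placeholder for your root table
        UNION ALL
        SELECT i.inhrelid
        FROM pg_inherits i
        JOIN tree t ON i.inhparent = t.relid
    )
    SELECT (SELECT count(*) FROM tree) AS relations_in_tree,
           (SELECT count(*) FROM pg_index
            WHERE indrelid IN (SELECT relid FROM tree)) AS indexes_in_tree;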