PostgreSQL 9.6 WAL management

I'm trying to understand how WAL files behave in my PostgreSQL environment. My WAL settings are:

wal_keep_segments = 200
max_wal_size = 3GB
min_wal_size = 80MB
archive_command = 'cp %p /PostgreSQL-wal/9.6/pg_xlog/wal_archives/%f' 
archive_timeout = 10
#checkpoint_flush_after = 256kB
#checkpoint_completion_target = 0.5

My WAL directory is /PostgreSQL-wal/9.6/pg_xlog/ and my archive directory is /PostgreSQL-wal/9.6/pg_xlog/wal_archives. Last night the storage holding my WAL directory filled up (the archive directory as well, because both are on the same file system).

I currently have 211 WAL files in my WAL directory:

ls -l /PostgreSQL-wal/9.6/pg_xlog/ | wc -l
212

The only thing running during the night was SELECT queries from our monitoring agent. I guess the WAL files were created because archive_timeout was very low, and that they were not deleted because wal_keep_segments was high.

This morning I set wal_keep_segments to 100 and archive_timeout to 6 minutes. After changing those settings and starting the cluster, WAL switching works fine and I no longer see that many WAL files being created. However, shouldn't the old WAL files be deleted automatically? Can I delete the archives safely?

Solution

max_wal_size is not a hard limit.

When the limit is exceeded, PostgreSQL will trigger a checkpoint, at the end of which it will delete rather than recycle old WAL segments. So pg_xlog (pg_wal in new releases) can still grow until the next checkpoint.

Your max_wal_size setting of 3 GB corresponds to 192 WAL segments of 16 MB each, which is fewer than the 200 segments required by your wal_keep_segments setting (corresponds to wal_keep_size in new releases). So PostgreSQL won't even begin to recycle or delete WAL segments until there are 200 of them, that is, 3.125 GB.
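The arithmetic behind those numbers can be checked with a small shell sketch. It assumes the default 16 MB WAL segment size (fixed at build time in 9.6). The helper names segments_for_mb and mb_for_segments are made up for this illustration and are not PostgreSQL tools.

```shell
# Default WAL segment size in MB (an assumption: a custom build may differ)
SEGMENT_MB=16

# Number of whole segments that fit in a size given in MB
segments_for_mb() {
    echo $(( $1 / SEGMENT_MB ))
}

# Disk space in MB taken by a given number of segments
mb_for_segments() {
    echo $(( $1 * SEGMENT_MB ))
}

segments_for_mb 3072   # max_wal_size = 3GB      -> prints 192
mb_for_segments 200    # wal_keep_segments = 200 -> prints 3200 (3.125 GB)
mb_for_segments 100    # wal_keep_segments = 100 -> prints 1600
```

With wal_keep_segments lowered to 100, the retained minimum drops to about 1.6 GB, which is below max_wal_size again.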

You should either lower wal_keep_segments or increase the disk space for pg_xlog. Once you have lowered wal_keep_segments, wait for a checkpoint or run CHECKPOINT manually. Then you will see a reduction of the number of WAL segments.
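To watch the effect, you can count the segment files before and after the checkpoint. The sketch below is only an illustration: count_wal_segments is a hypothetical helper, and it relies on the fact that WAL segment file names consist of exactly 24 hexadecimal characters, so subdirectories such as archive_status or wal_archives and files such as *.history are not counted (which a plain ls -l | wc -l would include).

```shell
# Count WAL segment files in a directory (hypothetical helper).
# Only names made of exactly 24 hex characters are counted.
count_wal_segments() {
    ls "$1" | grep -Ec '^[0-9A-F]{24}$'
}

# Example usage, assuming the directory from the question exists:
#   count_wal_segments /PostgreSQL-wal/9.6/pg_xlog
#   psql -c "CHECKPOINT;"     # run as a superuser
#   count_wal_segments /PostgreSQL-wal/9.6/pg_xlog
```

The psql call is left as a comment because the connection options depend on your setup.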