In a Django web app with a postgresql backend (and nginx reverse proxy with gunicorn application server), I'm seeing dozens of COMMIT
messages in postgresql's slow log. Behold:
2020-02-01 17:56:16.335 UTC [19424] ubuntu@app LOG LOG: duration: 175.630 ms statement: COMMIT
2020-02-01 17:56:21.355 UTC [19435] ubuntu@app LOG LOG: duration: 107.735 ms statement: COMMIT
2020-02-01 17:57:22.592 UTC [19419] ubuntu@app LOG LOG: duration: 235.313 ms statement: COMMIT
2020-02-01 17:57:30.685 UTC [19419] ubuntu@app LOG LOG: duration: 249.875 ms statement: COMMIT
2020-02-01 17:57:30.688 UTC [19424] ubuntu@app LOG LOG: duration: 99.049 ms statement: COMMIT
2020-02-01 17:57:30.688 UTC [19435] ubuntu@app LOG LOG: duration: 115.772 ms statement: COMMIT
2020-02-01 17:57:30.688 UTC [19554] ubuntu@app LOG LOG: duration: 248.656 ms statement: COMMIT
2020-02-01 17:58:03.266 UTC [19435] ubuntu@app LOG LOG: duration: 780.232 ms statement: COMMIT
2020-02-01 17:58:03.270 UTC [19424] ubuntu@app LOG LOG: duration: 622.424 ms statement: COMMIT
2020-02-01 17:58:07.579 UTC [19435] ubuntu@app LOG LOG: duration: 75.658 ms statement: COMMIT
The DB in question was just migrated from one dedicated server to another a day ago. In its previously environment, COMMIT
never showed up in slow log
. In the new environment, I made some changes:
1) I set checkpoint_completion_target = 0.7
(down from checkpoint_completion_target = 0.8
previously)
2) I switched to gevent
workers in gunicorn (previously used sync
workers). This also entailed I add the following to postgresql (allowing us to use psycopg2
asynchronously with gevent
- source):
from psycogreen.gevent import patch_psycopg
def post_fork(server, worker):
from gevent import monkey
patch_psycopg()
worker.log.info("Made PostgreSQL Green!")
monkey.patch_all()
3) Increased max_connections
to 400
(from 300
previously)
4) Increased shared_buffers
to 8GB
(from 6GB
previously)
And also a few other changes where I increased the thresholds (which ought to be a good thing).
Can an expert suggest whether the changes I've made above might have led to all the COMMIT statements littering my slow log
? If not, any other guesses?
And most importantly, what are some mitigation steps I can take to ameliorate this situation?
vmstat 1
yields:
ubuntu@main-app:~$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
16 0 0 10031160 263796 11650344 0 0 2 20 5 4 13 1 86 0 0
2 0 0 10030184 263796 11650308 0 0 0 1832 8598 9433 20 1 78 0 0
4 1 0 10028432 263796 11650416 0 0 40 436 7229 7887 20 1 79 0 0
6 0 0 10027076 263796 11650456 0 0 0 764 7836 8833 17 1 81 0 0
4 0 0 10025904 263796 11650464 0 0 0 588 7947 9754 22 1 76 0 0
4 0 0 10025988 263796 11650512 0 0 0 652 9727 12033 30 2 68 0 0
7 0 0 10025572 263796 11650532 0 0 0 816 8296 9984 25 1 73 0 0
2 1 0 10025680 263796 11650596 0 0 0 1128 8794 11003 23 1 75 0 0
0 0 0 10025552 263796 11650588 0 0 0 288 7153 8091 20 1 79 0 0
6 0 0 10025096 263796 11650612 0 0 0 412 9423 12016 25 2 73 0 0
1 0 0 10025056 263796 11650640 0 0 0 240 9227 11442 32 2 66 0 0
6 0 0 10025056 263796 11650800 0 0 32 1036 8762 10418 25 2 73 0 0
2 0 0 10025116 263796 11650828 0 0 0 352 8730 10924 23 2 75 0 0
6 0 0 10024992 263796 11650940 0 0 0 592 7920 9399 14 1 85 0 0
3 0 0 10024288 263796 11650952 0 0 0 380 8380 9662 23 1 75 0 0
4 0 0 10024536 263796 11650896 0 0 0 680 9193 10819 22 1 76 0 0
1 0 0 10024720 263796 11650776 0 0 0 588 9655 10757 24 2 74 0 0
5 0 0 10024820 263796 11650700 0 0 0 48 10237 13216 28 2 70 0 0
1 0 0 10023660 263796 11650716 0 0 0 396 9291 11251 34 2 64 0 0
6 0 0 10024564 263796 11650744 0 0 0 720 8557 10500 22 1 76 0 0
There are two likely explanations for slow commits:
Overloaded storage. This can be caused by
The high I/O volume.
Lots of small transactions, where the WAL sync requests are just too many.
WITH HOLD
cursors on larger queries.
On Linux, check the %iowait
column in vmstat 1
to see if the I/O subsystem is overloaded.
About your measures:
Increasing max_connections
or decreasing checkpoint_completion_target
will have an adverse effect, if anything.
Increasing shared_buffers
will help if the problem is the amount of read I/O.
If the problem are the many sync requests, and you can afford to lose a few committed transactions in the event of a crash, set synchronous_commit = off
.
If that is not an option, you can play with commit_delay
to get the I/O load down.
I have never heard of "gevent workers", so I cannot say anything about them.