My PHP 7.4 script select a large InnoDB table from a local MySQL 5.7 database and iterates every line using PDO. I'm using unbuffered queries by purpose as the whole table won't fit server memory.
The query is a simple SELECT * FROM table WHERE field_a = ... ORDER BY field_b, field_c
.
The code iterating the result is:
$rows = $db->prepare('redacted');
$rows->bindParam(':redacted', $redacted);
$rows->execute();
foreach($rows as $r) {
if($r->fieldname != redacted)
doWork1();
doWork2();
}
doWork2()
and it always does the following job, using a separate, buffered PDO connection:
Every iteraction doesn't take more than 1-2 seconds, but the whole loop could take hours to complete.
The script works fine, but exactly every 12:00 minutes, PDO throws the following exception:
[03-Feb-2023 08:34:29 Europe/Rome] PHP Fatal error: Uncaught PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away in redacted:176
Stack trace:
redacted
[03-Feb-2023 08:46:50 Europe/Rome] PHP Fatal error: Uncaught PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away in redacted:176
Stack trace:
redacted
Line number is always the foreach or the if.
mysqld.log says:
2023-02-03T08:36:33.480661+01:00 781575 [Note] Aborted connection 781575 to db: 'redacted' user: 'redacted' host: 'localhost' (Got timeout writing communication packets)
2023-02-03T08:48:56.534865+01:00 781768 [Note] Aborted connection 781768 to db: 'redacted' user: 'redacted' host: 'localhost' (Got timeout writing communication packets)
Upon restarting the script, it works fine on the same data, so it isn't a problem with values or datatypes. mysqladmin version
shows the server never crashed.
I suppose it is due to a sort of timeout.
I noticed the timestamp in mysql log is about 2 minutes later the PHP fatal error, so the disconnection due to writing timeout is the consequence, not the cause. For this reason, increasing net_read_timeout
and net_write_timeout
(120 seconds), wait_timeout
(8 hours), max_allowed_packet
(256 MB) didn't help. I reviewed carefully all server variables and I didn't have any value near 12 minutes or 720 seconds.
The query is fast (less than 3 seconds) and optimized; no big fields; it uses indexes; the query EXPLAIN is (relevant part):
594802 | 5.67 | Using index condition; Using where; Using filesort
.
I'm not in cases mentioned at https://dev.mysql.com/doc/refman/8.0/en/gone-away.html
The machine is a AWS Lightsail with Linux and cPanel.
# mysqladmin version
mysqladmin Ver 8.42 Distrib 5.7.40, for Linux on x86_64
redacted
Uptime: 10 days 10 hours 34 min 13 sec
redacted
Threads: 6 Questions: 15310531 Slow queries: 0 Opens: 9989 Flush tables: 1 Open tables: 1854 Queries per second avg: 16.972
I enabled query log and inspected carefull around time of the error and all queries are correct.
I was surprised when I was able to replicate this so easily. I created three separate PDO connections to my db server -
Without making any config changes, the thread handling the unbuffered query was being closed by the server after 16 minutes. I was unable to find anything in the session vars that would suggest a timeout at 16 minutes.
The solution in my case was to increase the net_write_timeout to be long enough for the entire unbuffered result to be fetched, row-by-row. I increased it to 30mins, 60mins and then 2 hours. Each time the connection for the unbuffered query result remained open until the specified timeout elapsed (1 - 6 minutes after).
I then set it to 36000 (10 hrs) and it has now been running for just over four hours -
SET SESSION net_write_timeout = 36000
I suspect using smaller buffered queries (num of rows dependent on your average row size) to select the data from the source table would be a better solution than holding this connection open for the unbuffered query for hours on end.
The only recent article I came across that discusses unbuffered queries and the effect of net_write_timeout is this one by Hayden James. It only touched on net_write_timeout in passing.