I'm using PHP deployer which is a symlink based deployment tool which calls opcache:reset
after deployment.
Recently I'm getting a segfault in my PHP-FPM processes after deployment.
This manifests as segfault entires in the PHP logs/memory exhausted warnings being output or 500/503 errors from Apache.
This has worked fine for years, so I'm stumped as to what has caused this to happen now.
I can reproduce this on every page refresh of an affected pool after the first segfault.
If I restart the PHP-FPM process, the page is served correctly the first time and then subsequently I get a segfault.
I suspect this is related to Opcache, since if I delete the Opcache cache files and restart PHP-FPM, the problem goes away. If I disable opcache.file_cache
, again there is no problem. Note that I'm not exclusively using the file cache.
When I (unscientifically) checked the opcache version of files reported by PHP during the memory exhaustion error, I noticed they seemed to be truncated or missing data that existed in previous versions of the opcache file where the segfault did not occur. When I removed those specific opcache files and restarted PHP-FPM the problem went away. So it seems, I think, that the opcache files are getting corrupted between deploys.
Environment:
My opcache config:
zend_extension=opcache.so
opcache.enable=1
opcache.enable_cli=0
opcache.memory_consumption=96
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=4096
opcache.max_wasted_percentage=5
opcache.validate_timestamps=1
opcache.revalidate_path=0
opcache.revalidate_freq=2
opcache.max_file_size=0
The core dump:
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/php-fpm...Reading symbols from /usr/lib/debug/usr/sbin/php-fpm.debug...done.
done.
[New LWP 6221]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `php-fpm: pool cpsastag '.
Program terminated with signal 11, Segmentation fault.
#0 ZEND_BIND_STATIC_SPEC_CV_UNUSED_HANDLER () at /usr/src/debug/php-7.4.15/Zend/zend_vm_execute.h:46883
46883 zval *value;
(gdb) bt
#0 ZEND_BIND_STATIC_SPEC_CV_UNUSED_HANDLER () at /usr/src/debug/php-7.4.15/Zend/zend_vm_execute.h:46883
#1 0x00005559a7261159 in execute_ex (ex=0x0) at /usr/src/debug/php-7.4.15/Zend/zend_vm_execute.h:57657
#2 0x00005559a7266930 in zend_execute (op_array=0x7f03fb075000, return_value=<optimized out>) at /usr/src/debug/php-7.4.15/Zend/zend_vm_execute.h:57957
#3 0x00005559a71e1a63 in zend_execute_scripts (type=type@entry=8, retval=0x7f03fb016330, retval@entry=0x0, file_count=file_count@entry=3) at /usr/src/debug/php-7.4.15/Zend/zend.c:1679
#4 0x00005559a717fd00 in php_execute_script (primary_file=primary_file@entry=0x7ffd44c9b940) at /usr/src/debug/php-7.4.15/main/main.c:2621
#5 0x00005559a6fe582b in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/php-7.4.15/sapi/fpm/fpm/fpm_main.c:1939
(gdb)
SELinux audit.log
type=ANOM_ABEND msg=audit(1614726077.349:20549): auid=4294967295 uid=1010 gid=1011 ses=4294967295 subj=system_u:system_r:httpd_t:s0 pid=825 comm="php-fpm" reason="memory violation" sig=11
I'd really appreciate any pointers on what could be wrong or how to debug further.
Our issue was not involving a symlink, but otherwise had similar errors and server configuration.
The segfault appears related to PHP's opcache files (opcache.file_cache) not being properly cleared, or else not syncing with the in-memory RAM opcache.
In our case, the opcache reset was triggered immediately after upgrading to Wordpress 6.3.1 and doing a 'Purge all Caches' with W3 Total Cache 2.4.1.
We do have SELinux enabled, and that's always a possible culprit, but this worked for years with SELinux, so it's unclear if that's contributing or if its logs are just capturing a 'memory violation' from a crash on a file it's monitoring.
Turning off the secondary opcache file-based caching for all sites / PHP-FPM pools by commenting out the opcache.file_cache in each active pool configuration file in /etc/php-fpm.d/. Then, reloading PHP-FPM (and Apache for good measure).
Secondary file-based opcache (via opcache.file_cache) is not necessary for most smaller servers, so this may be an acceptable solution / fix. You lose the benefits of the file caching, which I believe is mainly for quicker startups on server or service restarts.
opcache.file_cache
enabled with a custom pathgrep -r AH01067 /var/log/httpd/* | tail -n20
grep -r SIGSEGV /var/log/php-fpm/* | tail -n20
grep 'memory violation' /var/log/audit/audit.log* | tail -n 20
or grep 'opcache' /var/log/audit/audit.log* | tail -n 20
grep -r opcache /var/log/messages | tail -n20
| tail -n 20
will show only the last 20 instances. You can remove this part to see all results, or change the 20 to some other number.date -d@1693851386
, where 1693851386 is the timestamp value in the log.find / -inum 63377429
, where 63377429 is the inode value in the log.