Search code examples
node.jsexpresspm2

PM2 doesn't fully run signal event handler


I have a graceful shut down function in my Express application to handle receipt of signals. If I run the application not using PM2, the function works as expected. When I start the application in cluster mode using PM2, the function only executes the first line before dying.

As far as I can see in the logs SIGKILL isn't being sent, but it's also not fully executing the event handler callback before dying. I have been unable to determine why the event handler doesn't finish, but I have been able to figure out that this issue is unique to PM2. I tried updating the kill timeout, I tried using no-killtree, nothing has worked.

The relevant pm2.log section:

2022-11-09T15:23:28: PM2 log: App [server:7] starting in -cluster mode-
2022-11-09T15:23:28: PM2 log: App [server:7] online
2022-11-09T15:23:28: PM2 log: App [server:8] starting in -cluster mode-
2022-11-09T15:23:28: PM2 log: App [server:8] online
2022-11-09T15:23:28: PM2 log: App [server:9] starting in -cluster mode-
2022-11-09T15:23:28: PM2 log: App [server:9] online
2022-11-09T15:23:28: PM2 log: App [server:10] starting in -cluster mode-
2022-11-09T15:23:28: PM2 log: App [server:10] online
2022-11-09T15:39:26: PM2 log: Stopping app:server id:7
2022-11-09T15:39:26: PM2 log: Stopping app:server id:8
2022-11-09T15:39:27: PM2 log: App name:server id:8 disconnected
2022-11-09T15:39:27: PM2 log: App [server:8] exited with code [0] via signal [SIGINT]
2022-11-09T15:39:27: PM2 log: App name:server id:7 disconnected
2022-11-09T15:39:27: PM2 log: App [server:7] exited with code [0] via signal [SIGINT]
2022-11-09T15:39:27: PM2 log: pid=50526 msg=process killed
2022-11-09T15:39:27: PM2 log: pid=50533 msg=process killed
2022-11-09T15:39:27: PM2 log: Stopping app:server id:9
2022-11-09T15:39:27: PM2 log: Stopping app:server id:10
2022-11-09T15:39:27: PM2 log: App name:server id:9 disconnected
2022-11-09T15:39:27: PM2 log: App [server:9] exited with code [0] via signal [SIGINT]
2022-11-09T15:39:27: PM2 log: App name:server id:10 disconnected
2022-11-09T15:39:27: PM2 log: App [server:10] exited with code [0] via signal [SIGINT]
2022-11-09T15:39:27: PM2 log: pid=50540 msg=process killed
2022-11-09T15:39:27: PM2 log: pid=50547 msg=process killed

The event handler and process calls:

const onSignal = (signal) => {
        console.log('test');
        console.log(`${signal} signal received. Cleaning up and shutting down.`);
        console.log('test');

        console.log("Closing HTTPS server to new connections");

        httpsServer.close(error => {
                if (error) {
                        console.log(`Error closing HTTPS server: ${error.message}`);
                        console.log("Process cleanup did not occur due to error.");
                        process.exit(1);
                }
        });

        console.log("Closing SQL Connection Pool");

        db.end().then(() => {
                console.log("SQL Connection Pool successfully ended");
        })
        .catch(reason => {
                console.error(`Failed to close SQL Connection Pool. Error: ${reason.message}`);
        })
        .finally(() => {
                process.exit(0);
        });
};

process.on('SIGINT', onSignal);
process.on('SIGQUIT', onSignal);
process.on('SIGTERM', onSignal);

The only thing that shows up in the application log is the first console log, once for each pid of the cluster. I don't understand why the process is dying before the graceful shut down can finish. If it was a kill timeout issue, then when I set the kill timeout to one minute that should've been enough time, but it still killed the pid within seconds. I'm at a loss here what is causing this behavior.


Solution

  • I solved this by eliminating most of the logging. There seems to be a bug in PM2 cluster mode that affects all loggers. In cluster mode specifically PM2 only logs one statement then ignores all other log statements. The function still executes properly, but the logging does not happen.

    Essentially I only log one statement at any point of finality that describes how the function finished. I don't consider this an answer, but more of a jimmy rigged solution to get around a bug.

    Reference point for this bug: https://github.com/Unitech/pm2/issues/3691

    const onSignal = (signal) => {
        httpsServer.close(error => {
            if (error) {
                logger.error(`Signal ${signal} was received and there was an error closing HTTPS server: ${error.message} Graceful shut down process did not finish due to error.`);
                //exit with error
                process.exit(1);
            }
        });
    
        db.end().then(() => {
            logger.info(`Signal ${signal} was received and graceful shut down process finished.`);
        })
        .catch(reason => {
            logger.error(`Signal ${signal} was received and there was an error closing the SQL Connection Pool. Error: ${reason.message}`);
        })
        .finally(() => {
            process.exit(0);
        });
    };