Systemd - detect in ExecStopPost whether service exited without error


I have an application that should not be restarted once it has finished and exited normally. After the app has done its work, I'd like to shut down the instance (EC2). I was thinking of doing this with a systemd unit file using these options:

Restart=on-failure
ExecStopPost=/path/to/script.sh

The script that should run on ExecStopPost:

#!/usr/bin/env bash

# sleep 1; adding sleep didn't help

# this always comes out as "deactivating"
service_status=$(systemctl is-failed app-importer)

# could also do it the other way round and check for "failed"
if [ "$service_status" = "inactive" ]
then
  echo "Service exited normally: $service_status . Shutting down..."
  #shutdown -t 5
else
  echo "Service did not exit normally - $service_status"
fi
exit 0

The problem is that when the ExecStopPost script runs, I can't detect whether the service ended normally or not. At that point the status is always "deactivating"; only afterwards can I tell whether the service entered a failed state.


Solution

  • Your problem is that systemd considers the service to be deactivating until the ExecStopPost process finishes. Putting sleeps in doesn't help, since systemd is just going to wait longer. The idea behind ExecStopPost is to clean up anything the service might leave behind, such as temp files or UNIX sockets. The service is not done, and not ready to start again, until that cleanup has finished. So what systemd is doing does make sense if you look at it that way.

    What you should do is check $SERVICE_RESULT, $EXIT_CODE and/or $EXIT_STATUS in your script, which will tell you how the service stopped. Example:

    #!/bin/sh
    echo running exec post script | logger
    systemctl is-failed foobar.service | logger
    echo $SERVICE_RESULT, $EXIT_CODE and $EXIT_STATUS | logger
    

    When the service is allowed to run to completion:

    Sep 17 05:58:14  systemd[1]: Started foobar.
    Sep 17 05:58:17  root[1663]: foobar service will now exit
    Sep 17 05:58:17  root[1669]: running exec post script
    Sep 17 05:58:17  root[1671]: deactivating
    Sep 17 05:58:17  root[1673]: success, exited and 0
    

    And when the service is stopped before it finishes:

    Sep 17 05:57:22  systemd[1]: Started foobar.
    Sep 17 05:57:24  systemd[1]: Stopping foobar...
    Sep 17 05:57:24  root[1643]: running exec post script
    Sep 17 05:57:24  root[1645]: deactivating
    Sep 17 05:57:24  root[1647]: success, killed and TERM
    Sep 17 05:57:24  systemd[1]: Stopped foobar.
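
Putting this together, the ExecStopPost script from the question could branch on those three variables instead of calling systemctl is-failed. The following is only a minimal sketch: the function name classify_exit is made up for illustration, and the shutdown command is left as a commented placeholder because how the instance should be powered off is an assumption. It treats the run as clean only when all three values match the "ran to completion" log above (success, exited and 0), so a manual stop (success, killed and TERM) does not trigger a shutdown.

```shell
#!/bin/sh
# Hypothetical ExecStopPost helper: decide from the variables systemd
# exports whether the main process ran to completion without error.

# classify_exit SERVICE_RESULT EXIT_CODE EXIT_STATUS
# Prints "clean" only for a normal exit with status 0, else "unclean".
classify_exit() {
    if [ "$1" = "success" ] && [ "$2" = "exited" ] && [ "$3" = "0" ]; then
        echo clean
    else
        echo unclean
    fi
}

# Outside of systemd the variables are unset; default them to empty
# strings so the script reports "unclean" instead of erroring out.
outcome=$(classify_exit "${SERVICE_RESULT:-}" "${EXIT_CODE:-}" "${EXIT_STATUS:-}")
echo "service outcome: $outcome"

if [ "$outcome" = "clean" ]; then
    # Assumption: put the real shutdown command here, for example
    # systemctl poweroff
    :
fi
```

To try the logic without systemd, the variables can be set by hand, for example SERVICE_RESULT=success EXIT_CODE=exited EXIT_STATUS=0 sh /path/to/script.sh. Combined with Restart=on-failure, a failed run would still be restarted by systemd, while a clean run would reach the shutdown branch.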