Force systemd/ExecStop script to clean up processes

I'm running RHEL 7.x with systemd 219, and I have a systemd service called "myService" with the following settings:

ExecStart=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh start
ExecStop=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh stop
KillMode=none
Restart=no
Type=notify
NotifyAccess=all
###Can't use ExitType until Systemd 250 release ( systemctl --version )
#ExitType=cgroup

My ExecStart script launches a number of interdependent processes. From the startup script I send systemd a notify update with MAINPID= set to a single PID, the process I consider the most important, so that systemd monitors it. The systemctl status then looks like this:

● myService@myInstance.service - "myService Instance : myInstance"
   Loaded: loaded (/etc/systemd/system/myService@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-06-16 15:19:40 BST; 23h ago
  Process: 9700 ExecStop=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh stop (code=exited, status=0/SUCCESS)
 Main PID: 6099 (myMainProcess)
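
A minimal sketch of how such a MAINPID= notification can be sent from a bash start script, assuming the systemd-notify tool is used. The function name notify_main_pid is hypothetical, and the real script may well do this differently:

```shell
#!/bin/bash
# Hypothetical helper: tell systemd which PID to treat as the service's main process.
notify_main_pid() {
    local main_pid="$1"
    # systemd sets NOTIFY_SOCKET for Type=notify services; outside systemd it is unset.
    if [ -n "${NOTIFY_SOCKET:-}" ] && command -v systemd-notify >/dev/null 2>&1; then
        # --ready sends READY=1, --pid sends MAINPID=<pid>.
        systemd-notify --ready --pid="$main_pid"
    else
        # Fallback so the script can still be run by hand for testing.
        echo "systemd notification unavailable, skipping MAINPID=${main_pid}" >&2
    fi
}
```

In a start script this would typically be called right after launching the important process in the background, for example with "$!" as the argument.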

Sometimes this main PID fails and exits, but the app itself is self-healing and usually recovers, which is why we use the following in our /etc/systemd/system/myService.service definition file:

KillMode=none
Restart=no

In this scenario systemctl reports the "failed" status below. That is of no major consequence because of the self-healing mentioned above, and we can live with an incorrect status for a time:

● myService@myInstance.service - "myService Instance : myInstance"
   Loaded: loaded (/etc/systemd/system/myService@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2022-06-17 14:58:44 BST; 1s ago
  Process: 9700 ExecStop=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh stop (code=exited, status=0/SUCCESS)
  Process: 6099 ExecStart=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh start (code=exited, status=1/FAILURE)
 Main PID: 6099 (code=exited, status=1/FAILURE)

However, sometimes the app can't recover and we want to do a "clean" shutdown of all services with the requisite cleanups, which our ExecStop script does quite nicely.

But when I run this, nothing seems to run:

   systemctl stop myService@myInstance

Whereas when I run this, I can see the startup messages from ExecStart immediately:

   systemctl restart myService@myInstance

So it seems to me that when the service is in a "failed" state, the ExecStop script never runs.

I even tried running the following, but that didn't get ExecStop to run either:

   systemctl reset-failed myService@myInstance

● myService@myInstance.service - "myService Instance : myInstance"
   Loaded: loaded (/etc/systemd/system/sierra@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2022-06-17 15:30:16 BST; 8min ago
  Process: 9700 ExecStop=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh stop (code=exited, status=0/SUCCESS)
  Process: 27952 ExecStart=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh start (code=killed, signal=KILL)
 Main PID: 27952 (code=killed, signal=KILL)

Is there any way I can force the ExecStop script to run using systemctl commands/context only? I know I can run parts of the ExecStop script manually for the cleanups, but given all the internal systemd semantics in the script, it would be hard to decouple.

Hello,

I think this might be a case for the ExecStopPost option. According to the man page for systemd.service:

       ExecStopPost=
           Additional commands that are executed after the service was stopped. This
           includes cases where the commands configured in ExecStop= were used,
           where the service does not have any ExecStop= defined, or where the service
           exited unexpectedly. This argument takes multiple command lines, following
           the same scheme as described for ExecStart. Use of these settings is optional.
           Specifier and environment variable substitution is supported.

What this means is that, in the event of a service failure, the commands given for ExecStopPost are executed. When whatever event causes the service to enter a failed state occurs, the ExecStopPost commands run at that point, even if the service was not stopped via systemctl (which is normally the only thing that would cause the ExecStop commands to be run). Your cleanup commands would therefore run whether the service was stopped manually or crashed or failed for some other reason.
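
As a rough sketch, the unit file could gain one extra line along these lines. The "cleanup" argument is purely an assumption on my part; if your script has no such action, pointing ExecStopPost at whatever action performs the cleanups may do:

```ini
[Service]
ExecStart=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh start
ExecStop=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh stop
# Assumption: "cleanup" is a hypothetical action, not something your script is known to support.
ExecStopPost=/bin/bash /asp_APPL/myInstance/script/myServiceSystemdCtrl.sh cleanup
```

After editing the unit file, run systemctl daemon-reload so that systemd picks up the change.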

I realise this isn't quite what you asked for, particularly since you say systemd sees your service as failed when in reality it has not failed, so it still may not give you quite what you need. However, I'd suggest that the "real" fix here, so to speak, might be to make the service fully compliant with systemd (i.e. so that when systemd believes it has failed, it truly has), which would make auto-cleanup via ExecStopPost a viable option.
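
One possible shape for that, sketched under a lot of assumptions: keep a small supervising loop alive as the main PID and let it exit with a failure only once a health check has failed several times in a row, so short self-healing gaps are not reported to systemd. The names supervise, MAX_FAILURES and POLL_SECONDS are made up for illustration, and the health check itself would have to come from your application:

```shell
#!/bin/bash
# Hypothetical supervisor: run "$@" as a health check every POLL_SECONDS and
# return 1 only after MAX_FAILURES consecutive failed checks.
supervise() {
    local failures=0
    while [ "$failures" -lt "${MAX_FAILURES:-3}" ]; do
        if "$@"; then
            failures=0                    # a healthy check resets the counter
        else
            failures=$((failures + 1))    # count consecutive failures only
        fi
        sleep "${POLL_SECONDS:-5}"
    done
    return 1
}
```

If the start script ended by calling supervise with a suitable check and then exited non-zero, systemd would only see the main process fail once the app had stopped recovering.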

Hope at least one of these options helps! If not, or if you have any further questions, please let us know and we can take things from there.
