A script that kills previous instances of itself when run does not kill their child processes

I'm likely going to explain this clumsily, so apologies in advance:

I have the following script:

#!/bin/bash

pidPrefix="logGen"


checkPrime ()
{
  if /sbin/ifconfig eth0:0|/bin/grep -wq inet;then isPrime=1;else isPrime=0;fi
}

killScript ()
{
  /usr/bin/find /var/run -name "${pidPrefix}.*.pid" |while read pidFile;do
    if [[ ! "${pidFile}" =~ "${$}" ]];then
      #/bin/kill $(/usr/bin/tail -1 ${pidFile})
      /bin/kill $(/bin/cat ${pidFile})
      /bin/rm ${pidFile}
    fi
  done
}
echo "$$" > /var/run/${pidPrefix}.$$.pid
killScript
#  echo "$$" > /var/run/${pidPrefix}.$$.pid
   tail -F -n0 /opt/REDACTED/logs/manager.log|while read -a  i;do
    if [[ "${i[2]}" == "E" ]];then
      if [[ "${i[7]:0:1}" == "@" ]];then
        echo "${i[0]}|${i[1]}|${i[6]}|${i[@]:8:${#i[@]}}" >> /tmp/allerrors.${i[0]//\//.}.log
      else
        echo "${i[0]}|${i[1]}|${i[6]}|${i[@]:7:${#i[@]}}" >> /tmp/allerrors.${i[0]//\//.}.log
      fi
    fi
  done

You can ignore the bulk of it; the main points are:

  • The function killScript is meant to check for lock files in /var/run (excluding the lock file created by the current instance of the script) and kill the PID stored in each one.
  • A tail is kicked off against a log file, and its output is parsed into a separate log file (if anyone actually cares, I can show sample input for the script and explain what everything does, but it's not really relevant to my problem).
  • The goal is to run this script periodically from cron. Every time it runs, the last running instance of the script is killed and a new one is started.

So here's the issue. The script starts a child process, and killing just the first process's PID doesn't kill the child process. It keeps running until manually killed, as seen below:

[root@liwmgmt02 utils]# bash new.logGen.sh #kick off an instance of the script

[root@liwmgmt02 utils]# ps -ef|grep [n]ew.log #in a separate terminal check pids
  root 11391 7109 0 16:51 pts/1 00:00:00 bash new.logGen.sh
  root 11395 11391 0 16:51 pts/1 00:00:00 bash new.logGen.sh
  • So now the script is running, and there are two pids
[root@liwmgmt02 utils]# bash new.logGen.sh #in a separate terminal start a second instance
  • In the first terminal the script exited back to a prompt and says Terminated
[root@liwmgmt02 utils]# ps -ef|grep [n]ew.log
  root 11395 1 0 16:51 pts/1 00:00:00 bash new.logGen.sh
  root 14623 10569 0 16:52 pts/3 00:00:00 bash new.logGen.sh
  root 14630 14623 0 16:52 pts/3 00:00:00 bash new.logGen.sh
  • and now I have the child process of the script still running and doing stuff.
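
For what it's worth, the second bash process in the ps output above is probably the subshell that bash forks to run the while loop on the right-hand side of the tail pipe (tail itself does not match the grep pattern). A minimal sketch of that behaviour, assuming bash 4 or later for BASHPID; the scratch file is made up purely for the illustration:

```shell
#!/bin/bash
# $$ always reports the main script's PID; BASHPID reports the shell that is
# actually executing, so the two differ inside a pipeline subshell.
main_pid=$$
scratch=$(mktemp)            # hypothetical scratch file, only for this demo

# Same shape as "tail -F ... | while read": the loop runs in a forked subshell.
echo "one line" | while read -r line; do
  echo "$BASHPID" > "$scratch"
done

loop_pid=$(cat "$scratch")
rm -f "$scratch"

echo "main shell: $main_pid, loop subshell: $loop_pid"
```

The lock file only records $$, so a plain kill of that PID leaves the loop subshell running, which matches what ps shows after the second instance starts.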

So yeah, this is my problem. How can I accomplish my goal of making a script that can kill previous instances of itself when it's run if the script spawns child processes?

Also, this script is far from done. There's still error checking to be added, and comments, and general tidying up. But I've reached this point and don't know how to proceed.

Try killing the process ID group with:

/bin/kill -$(/usr/bin/tail -1 ${pidFile})

Note: you are passing the negative of the parent shell's PID to kill, which makes kill signal the whole process group. All child shells should be in the process group of the parent shell.

Also, some implementations of kill require -- to avoid -$PID being interpreted as a signal number, e.g.:

/bin/kill -- $(/usr/bin/tail -1 ${pidFile})

Either I'm not understanding your suggestion, or I'm not doing it properly.

[root@liwmgmt02 ~]# ps -ef|grep [n]ew.log
root     13507  7458  0 19:57 pts/1    00:00:00 bash ./new.logGen.sh
root     13514 13507  0 19:57 pts/1    00:00:00 bash ./new.logGen.sh
[root@liwmgmt02 ~]# kill -- 13507
[root@liwmgmt02 ~]# ps -ef|grep [n]ew.log
root     13514     1  0 19:57 pts/1    00:00:00 bash ./new.logGen.sh
[root@liwmgmt02 ~]#

---------- Post updated at 08:15 PM ---------- Previous update was at 07:59 PM ----------

Someone suggested using ps --ppid to me, and I came up with this:

ps --ppid $(cat /var/run/logGen.23223.pid)|while read -a i;do if [[ "${i[0]}" != PID ]];then kill "${i[0]}" > /dev/null;fi;done

though I'm not sure if this is what I'm going to go with. Just a thought for now.

In your example above the kill command should have been:

kill -- -13507

Notice how I have placed a hyphen in front of the process ID.
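
To see the effect in isolation, here is a rough sketch rather than anything taken from the script above. It assumes bash with job control switched on via set -m, so that the background job gets its own process group; the scratch file and the "tick" loop are invented stand-ins for the log parser.

```shell
#!/bin/bash
# Sketch: signal a whole process group by passing a negative PID to kill.
set -m                      # job control: each background job gets its own process group

out=$(mktemp)               # hypothetical scratch file the child keeps appending to

# Stand-in for the script: a parent shell whose child loop writes to $out.
bash -c 'while :; do echo tick >> "$1"; sleep 0.1; done & wait' _ "$out" &
leader=$!                   # first process of the job, so also its process group ID

sleep 0.5                   # let the child start writing
kill -- -"$leader"          # negative PID: signal every process in that group
sleep 0.5                   # let the signal land

# If the child is really gone, the file stops growing.
before=$(wc -l < "$out")
sleep 0.5
after=$(wc -l < "$out")

if [ "$before" -eq "$after" ]; then
  result="child stopped"
else
  result="child still running"
fi
echo "$result"
rm -f "$out"
```

This only works when the PID you negate is also a process group ID, i.e. the process was started as a group leader. That is the case for a command launched from an interactive shell, as in the ps output above; whether it holds under cron depends on how cron launches the job, so it is worth checking with ps -o pid,pgid first.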

Perfect. That works perfectly (your original post didn't say that the -- form also needs the - in front of the PID). I updated the killScript function to:

killScript ()
{
  # Kill every other instance's process group, then remove its lock file.
  /usr/bin/find /var/run -name "${pidPrefix}.*.pid" |while read -r pidFile;do
    if [[  "${pidFile}" != "/var/run/${pidPrefix}.${$}.pid" ]];then
      /bin/kill -- -"$(/bin/cat "${pidFile}")"
      /bin/rm "${pidFile}"
    fi
  done
}

and everything runs as I would expect it to. (I also made checking stricter just to be safe)

Thanks for bearing with me.
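
For anyone who wants to try the lock-file loop without touching /var/run, below is a self-contained sketch under a few assumptions: pidDir is a throwaway temp directory standing in for /var/run, a background sleep plays the "previous instance", and a glob replaces find so the loop runs in the current shell. It is a variation on the function above, not the exact code from the thread.

```shell
#!/bin/bash
set -m                                   # background jobs get their own process group

pidPrefix="logGen"
pidDir=$(mktemp -d)                      # hypothetical stand-in for /var/run

killScript ()
{
  for pidFile in "${pidDir}/${pidPrefix}".*.pid; do
    [ -e "${pidFile}" ] || continue                                  # glob matched nothing
    [ "${pidFile}" = "${pidDir}/${pidPrefix}.$$.pid" ] && continue   # our own lock file
    kill -- -"$(cat "${pidFile}")" 2>/dev/null                       # group may already be gone
    rm -f "${pidFile}"
  done
}

# Fake "previous instance": a background job with a lock file named after its PID.
sleep 300 &
old=$!
echo "${old}" > "${pidDir}/${pidPrefix}.${old}.pid"

# Current instance: write our own lock file, then clear out the others.
echo "$$" > "${pidDir}/${pidPrefix}.$$.pid"
killScript

remaining=$(ls "${pidDir}")              # only our own lock file should be left
echo "${remaining}"
rm -rf "${pidDir}"
```

Silencing kill's error output is a design choice for stale lock files whose process has already exited; drop the redirect if you would rather see those cases.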