Find a process ID, kill it and restart agent

samrat_dutta · August 27, 2015, 4:02pm

#!/bin/bash
#This shell finds the pid of the hawkagent and kills and restarts to put the rulebase into effect
output=`ps aux|grep hawkagent`
#The set -- below helps to parse the above ps output into words and $2 gives the 2nd word which is pid
set -- $output
pid=$2
#Checks if pid of hawkagent exists then kills else starts the hawkagent and exits the program
if [[ $? -eq 0 ]]
then
{
        echo "Kill and start hawkagent if present"
        echo $pid
        kill $pid
        #Wait for 2 seconds for hawkagent to be killed else kill it forcefully if it still exists
        sleep 2
        kill -9 $pid > /dev/null 2>&1
        #Start the hawk agent after killing the existing pid
        nohup ./hawkagent_BWQA &
}
else
{
        echo "Start hawkagent if not present"
        #Start the hawk agent if the hawkagent is not available and exits the program
        nohup ./hawkagent_BWQA &
};
fi
exit

The objective of the above script is to find a process ID, kill it, and then restart the agent.
I have the 2 scenarios below:
(a) If condition - works when a process ID is already present: the process is killed and then the agent is restarted using the following command:

nohup ./hawkagent_BWQA &

.
(b) Else condition - when the agent is not running there is no process ID, so the else part should start the agent and exit.
However, in this scenario the else part is not working. I believe my if condition below is not working properly:

if [[ $? -eq 0 ]]

.
Any thoughts or suggestions are appreciated.

Scrutinizer · August 27, 2015, 4:47pm

Hi,

if [[ $? -eq 0 ]]

will not work since the previous command:

pid=$2

will always return exit code 0.

Try this instead:

if [ $# -gt 0 ]

--
You could also use grep's return code:

if output=$(ps -ef | grep "h[a]wkagent")
then
  set -- $output
  pid=$2
  ...
else
  ...
fi

Or if your system contains pgrep, then you can just use:

if pid=$(pgrep hawkagent)
then 
  ...
else
  ...
fi
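
To illustrate the point about $?, here is a minimal sketch. It uses printf and a made-up search word in place of the real ps output, so the variable names pipeline_status and assignment_status are only for illustration. It shows that a plain assignment resets $? to 0, which is what the original if statement ended up testing.

```shell
#!/bin/bash
# $? always holds the exit status of the most recent command only.

# A pipeline that finds nothing: grep exits with 1.
# The status is captured right away, before anything else can overwrite it.
pipeline_status=0
output=$(printf 'alpha\nbeta\n' | grep gamma) || pipeline_status=$?

# A plain assignment succeeds, so $? is reset to 0 here.
pid=$output
assignment_status=$?

echo "pipeline=$pipeline_status assignment=$assignment_status"
# prints: pipeline=1 assignment=0
```

Testing the command directly in the if, as in the two variants above, avoids the need to save the status at all.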

RudiC · August 27, 2015, 4:54pm

You need to preserve the exit code of output=`ps aux|grep hawkagent` , e.g. by adding a temporary variable immediately after the command: TMP=$? and checking that afterwards. Or try something like

if pid=$(ps axopid,comm | grep [h]awkagent)
  then echo "Kill and start hawkagent if present"
       kill ${pid% *}
  else echo "Start hawkagent if not present"
  fi
nohup ./hawkagent_BWQA &

samrat_dutta · August 27, 2015, 5:43pm

Thanks.
I executed it and it worked. However, when I run my code as ./test.sh I get the desired output, but the nohup screen doesn't exit.

[BWQA]$ ./test.sh >> temp.txt
[BWQA]$ nohup: redirecting stderr to stdout

At some point I shall be calling this .sh file from my ant script.

MadeInGermany · August 28, 2015, 2:04am

ps aux and ps -ef print the command arguments, so a plain grep hawkagent also matches its own entry in the process list. Hence the trick grep "[h]awkagent" .
Note the quotes that prevent the shell from matching the pattern against files in the current directory.
pgrep does not need the trick.
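
A small sketch of why the bracket trick works. The two snapshots are canned stand-ins for ps -ef output (the users, PIDs and layout are made up), so it can be run without any agent present.

```shell
#!/bin/bash
# Canned stand-ins for two "ps -ef" snapshots; PIDs and layout are made up.
# Each snapshot contains the agent plus the grep that is searching for it.
snapshot_plain='user  4711  1  ./hawkagent_BWQA
user  4720  1  grep hawkagent'
snapshot_trick='user  4711  1  ./hawkagent_BWQA
user  4720  1  grep [h]awkagent'

# The plain pattern also matches the grep command line itself.
plain_matches=$(printf '%s\n' "$snapshot_plain" | grep -c "hawkagent")

# The regex [h]awkagent matches the text "hawkagent", but the grep command
# line contains the literal text "[h]awkagent", which does not match.
trick_matches=$(printf '%s\n' "$snapshot_trick" | grep -c "[h]awkagent")

echo "plain=$plain_matches trick=$trick_matches"
# prints: plain=2 trick=1
```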

RudiC · August 28, 2015, 3:46am

What do you mean?

samrat_dutta · August 28, 2015, 11:32am

Hi, I need to execute the whole thing from ant, so I have reduced it to a single line entry in ant.
How do I combine the two commands below into a single line that finds the PID, kills it, and also starts my hawkagent?

kill $(ps axopid,comm | grep [h]awkagent | awk '{print $1}')

nohup ./hawkagent_BWQA &

I used the line below. The agent is killed but nohup is not starting the agent. Not sure if I am missing something.

kill $(ps axopid,comm | grep [h]awkagent | awk '{print $1}') | nohup ./opt/tibco/tra/domain/BWQA/hawkagent_BWQA > /dev/null &

RudiC · August 28, 2015, 12:30pm

Replace the last pipe with a semicolon, and check the chars at the end of the line.
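
A dry-run sketch of the semicolon form. It assumes canned text in place of live ps output (the PIDs are made up) and uses echo as a stand-in for both kill and the agent start, so nothing real is touched.

```shell
#!/bin/bash
# Canned stand-in for "ps axopid,comm" output; the PIDs are made up.
sample='  PID COMMAND
 4711 hawkagent_BWQA
 4712 bash'

# Same filter as in the one-liner: keep hawkagent lines, print the first field.
pid=$(printf '%s\n' "$sample" | grep "[h]awkagent" | awk '{print $1}')

# ";" runs the second command after the first has finished.
# With "|" both would start together, with the first one's output
# fed into the second one's stdin.
# "echo" stands in for kill and for the agent start (dry run only).
result=$(echo "kill $pid" ; echo "nohup ./hawkagent_BWQA &")
printf '%s\n' "$result"
# prints:
# kill 4711
# nohup ./hawkagent_BWQA &
```

On the real system the two echo stand-ins would be the actual kill and nohup commands, separated by the same semicolon.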

bakunin · August 28, 2015, 1:27pm

Sorry, but the idea of using "ps" inside (any form of) a script to manipulate processes is flawed from the start.

Processes are allowed to lie about their credentials, except to their parents. Scripts (a script is a sort of automaton) are simply not smart enough to parse ps output automatically. ps is designed for interactive use because systems administrators are supposed to be smarter than scripts.

Having said this: I do not know this hawkagent program, but it has to be started somewhere. This "somewhere" is most probably /etc/inittab . Start it there with the respawn action and the system will restart it for you every time it stops. The system (to be precise: its init process) will even be safe in doing so because init will be the true parent of the process.

Everything else is risky at best and terminal to the system at worst.

I hope this helps.

bakunin

PS: if your system has no /etc/inittab, get runit or daemontools.
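
As a rough illustration of the respawn suggestion, a SysV-style /etc/inittab line has the form id:runlevels:action:process. In the sketch below the id "hawk", the runlevels 2345 and the absolute path are all assumptions for illustration and need adapting to the actual system. It also assumes the agent stays in the foreground, since respawn restarts whatever process init itself started.

```
# /etc/inittab entry, format id:runlevels:action:process
# The id "hawk", the runlevels 2345 and the path are assumptions.
hawk:2345:respawn:/opt/tibco/tra/domain/BWQA/hawkagent_BWQA
```

With such an entry in place, killing the agent would be enough to restart it, because init starts it again by itself.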

samrat_dutta · August 28, 2015, 2:56pm

Is there a way to come out of nohup automatically? The following code remains in a hung state and control is never returned.

nohup ./hawkagent_BWQA &

RudiC · August 28, 2015, 3:27pm

"Control" is not meant to return. By sending the process to background with & you chop it off normal terminal stdin. And, with nohup on top, you remove the HUP signal path from its parent (your interactive process), and the init process becomes its parent.
You'll have to build in extra code for interaction with the outer world, e.g. via FIFOs.

MadeInGermany · September 2, 2015, 5:05pm

I think in some (or most) cases neither the & nor the nohup cuts off the shell's input and output.
Therefore I recommend explicitly redirecting the std handles:

nohup ./hawkagent_BWQA </dev/null >/dev/null 2>&1 &
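
One place where the difference shows up is a caller that captures the output, as a build tool may do. The sketch below is only an illustration under that assumption: "sleep 2" stands in for the long-running agent, and a command substitution plays the role of the caller.

```shell
#!/bin/bash
# "sleep 2" stands in for the long-running agent.

# Case 1: the background job inherits the pipe that $(...) reads from,
# so the caller waits until the job exits.
start=$SECONDS
plain=$(sleep 2 & echo "launched")
plain_elapsed=$(( SECONDS - start ))

# Case 2: all three standard handles point at /dev/null,
# so nothing keeps the pipe open and the caller returns at once.
start=$SECONDS
detached=$(nohup sleep 2 </dev/null >/dev/null 2>&1 & echo "launched")
detached_elapsed=$(( SECONDS - start ))

echo "plain: $plain after ${plain_elapsed}s"
echo "detached: $detached after ${detached_elapsed}s"
```

In the first case the caller is held up for about as long as the job runs, in the second it gets control back immediately even though the job is still running.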

RudiC · September 2, 2015, 5:39pm

The phrasing may have been a bit unfortunate. If a process sent to background tries to read from the terminal while its stdin still points to the terminal, it is put into a stopped state, so it doesn't interfere with the interactive shell. By sending a process to bg, you intentionally cede interactive control.
You can get back "control" by putting it into foreground with the fg command.