Hi friends,
I have a Unix command that I use to check the network status manually.
The command is:
check_Network
It gives the following status:
Network 1 is ok
Network 2 is ok
network 3 is ok
network 4 is ok
.
.
.
.
Network 10 is ok
Sometimes the command
check_Network
does not give any output and hangs. I then need to press ctrl+c to get back to the server command prompt.
After that I restart the network handler processes with the command
restart ntework_all
After that, the command
check_Network
gives the correct output as above.
Now I want to automate this task by writing a shell script.
1) The script should check the network status at a regular interval (say, every 20 minutes).
2) If the command
check_Network
does not give any output and hangs, the script should break out of the hung state (similar to pressing ctrl+c), restart the network handler processes with
restart ntework_all
and then check the network status again.
3) After the restart, it should send me a mail saying "Network Handler has been restarted at <Time>".
This might get you started - this version ignores any output from check_Network and only restarts if check_Network times out.
You could redirect the output of check_Network to a file and process the file at the bottom of this script, checking for other conditions that require a restart.
I'd suggest running this script every 20 minutes from cron rather than having it sleep and loop all the time.
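For the cron route, an entry along these lines is one way to do it. This is only a sketch: the script path is a hypothetical location, and the minutes are listed explicitly because the */20 step syntax is not accepted by every cron.

```shell
# Hypothetical install location; substitute the real path of the script.
script_path=/usr/local/bin/check_network_watch.sh

# Fields: minute hour day-of-month month day-of-week command
# Runs at minute 0, 20 and 40 of every hour; cron mails any output by
# default, so it is discarded here.
cron_entry="0,20,40 * * * * $script_path >/dev/null 2>&1"

# Print the line so it can be pasted in with `crontab -e`.
echo "$cron_entry"
```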
TIMEOUT_SECS=20
command_timeout () {
    # Nothing to do if check_Network has already finished
    # (this test assumes the system has a /proc filesystem)
    [ -d /proc/$check_pid ] || exit
    kill $check_pid
    wait $check_pid 2> /dev/null
    restart ntework_all
    echo "Network Handler has been restarted at $(date)" | mail -s "Network Handler" nakul_sh@mail.unix.com
}
check_Network > /dev/null 2>&1 &
check_pid=$!
parent_pid=$$
# Set up an alarm after TIMEOUT_SECS that calls command_timeout
# (the signal name is given without the SIG prefix, which every POSIX shell accepts)
trap command_timeout ALRM
(sleep $TIMEOUT_SECS; kill -ALRM $parent_pid ) &
alarm_pid=$!
# Wait for check_Network to finish
wait $check_pid 2> /dev/null
# We are back so cancel alarm
[ -d /proc/$alarm_pid ] && kill $alarm_pid
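As a rough sketch of the "process the file" idea mentioned above: network_output_ok is a hypothetical helper name, and both the line format and the count of 10 are assumptions taken from the sample output in the first post. The thread does not show what check_Network prints for a failed network, so this only counts the "is ok" lines.

```shell
# Hypothetical helper: succeed only if FILE contains EXPECTED lines that look
# like "Network <n> is ok". Format and count are assumptions taken from the
# sample output; the thread does not show what a failing network prints.
network_output_ok() {
    file=$1
    expected=$2
    # An unreadable or missing file counts as a failure.
    [ -r "$file" ] || return 1
    # -c counts matching lines, -i ignores case ("Network" vs "network").
    ok_count=$(grep -ci '^network [0-9][0-9]* is ok[[:space:]]*$' "$file")
    [ "$ok_count" -eq "$expected" ]
}

# Example use at the bottom of the script, after check_Network has written
# its output to a file (the path here is only an illustration):
#   network_output_ok /tmp/check_Network.out 10 || restart ntework_all
```

To use it, the check_Network line in the script would have to write to that file instead of /dev/null.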
I ran this script, and it seems there are some issues with the PID when it attempts the kill. It remains hung after the output below, and I need to press ctrl+c to get back to the command prompt.
Unix_buzz:>sh test123.sh
logout
Unix_buzz:>kill: 8182: The specified process does not exist.
But it seems that the timeout is working fine, because after 20 seconds it immediately attempts the kill <PID>.
I appreciate your help in this matter.
My guess is you pressed ctrl+c before the 20 second timeout. If 20 seconds is too long (i.e. you're inclined to press ctrl+c earlier than that), reduce the time; perhaps 5 seconds is a better fit for this command.
No, I didn't press ctrl+c before 20 seconds. It seems there are some issues with the PID only.
Is it possible to use any other method to get the PID, instead of using $! and $$?
No. What shell are you running this in?
Of course this check_Network is a black box and could be spawning multiple other background processes, and it's one of those that's getting stuck.
When you manually clean up, how do you find the PIDs? Can you paste a ps listing of the stuck processes?
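Something along these lines would produce a useful listing while check_Network is hung. It is a minimal sketch: list_procs is a hypothetical helper name, and it assumes ps -ef on your system prints a header line followed by PID and PPID columns.

```shell
# Hypothetical helper: show every process whose ps line mentions NAME,
# keeping the header so the PID and PPID columns stay labelled.
list_procs() {
    name=$1
    # Take the snapshot first so the filter below never matches itself.
    snapshot=$(ps -ef)
    # Keep line 1 (the header) plus any line containing NAME.
    printf '%s\n' "$snapshot" | awk -v name="$name" 'NR == 1 || index($0, name)'
}

# Run this from a second terminal while the command is stuck.
list_procs check_Network
```

The PPID column shows which of the listed processes were started by check_Network itself, which matters if it spawns children.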
I am using ksh.
When there is a problem checking the network status, I get the following blank output.
After that I press ctrl+c on the keyboard to get back to the command prompt.
Unix_buzz:>check_network
Unix_buzz:>
I mean to say, when check_network is hung, I am unable to do anything until I press ctrl+c.
As previously stated, check_network is a black box and we don't know what's inside it. Could you tell us what's inside it? Or is it a binary app?