Shell script for continuously checking the status of another script running in the background, and immedia

Hi,

I want to write a script that continuously checks the status of a script started in the background with nohup and, if that script is not running, immediately starts it again. Please help.

I am using the command below to run the script:

nohup system_traps.sh &

but for some reason the background script gets terminated automatically by the superuser.

Or is there any option to run a script continuously in the background without failure or termination?

The problem is likely with system_traps.sh, not the fact that you need to monitor it.

As a guess: it has some kind of fatal error, and you do not have adequate error reporting in the code to figure out what is going wrong.

Can you post the section of system_traps.sh that fails?
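To illustrate what "adequate error reporting" could look like, here is a minimal sketch that could go near the top of a bash script such as system_traps.sh. The log path, the TRAP_LOG variable and the log_exit function name are assumptions made up for this example, not anything from the original script.

```shell
#!/bin/bash
# Hypothetical log location (assumption); override with TRAP_LOG if needed.
LOG="${TRAP_LOG:-/tmp/system_traps.exit.log}"

log_exit() {
    # $1 describes why the script ended (exit status or signal name).
    printf '%s pid=%s ended: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$1" >> "$LOG"
}

# Record the exit status on every exit path.
trap 'log_exit "exit status $?"' EXIT

# Record catchable signals, then exit so the EXIT trap fires as well.
for sig in HUP INT TERM; do
    trap "log_exit 'signal $sig'; exit 1" "$sig"
done

# ... the real work of the script would go here ...
```

With this in place, the log should show whether the script exited by itself or was sent a signal. Note that SIGKILL cannot be trapped, so a kill -9 would still leave no trace in this log.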

Hi Jim,

Thanks for the reply.
I run this script as my own user and leave it running in the background overnight, but by the next day the script has exited, and I find no error in nohup.out. So if the same thing happens again, I want another script that continuously checks whether the background script is running and, if it is not, starts it in the background.

I understand what you want. What you need is something different, IMO. Something is wrong with the script that ends abnormally. Fix that first. Restarting it constantly is not a valid solution.

Your broken script is not doing what you want and may be doing things you do not want done as a consequence. The reason I'm taking this position: since you do not know how to monitor and check a process, it is very likely that whatever you did in your other script has issues as well. It is like handing a grenade to a kid who asks for one, then he asks 'Which one is the pin I pull?'

We can provide you with a script, no problem. What OS and shell are you using? Do you have crontab access?

Hi Jim,

I am using Linux and yes I have access to crontab.
Meanwhile, I wrote a shell script that continuously checks the status of the script running in the background and starts it if it is not running. Now I want to schedule this checker script in crontab at the smallest possible interval (every second), but crontab only allows a minimum interval of one minute. That means alerts could be missed within that minute if the script is stopped or killed: my script generates alerts from a live log file, so I can't afford to leave a single line unread. Could you please help me with a better way to implement this?

Thanks,
ketanr

while :
do
   sleep 1
   check stuff
done

and nohup that.
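For what it's worth, the "check stuff" part could be sketched roughly as below. This is only an illustration under assumptions: the script path, the PID file location and the function names (is_running, start_script, watch) are hypothetical, and it relies on a PID file rather than searching the process list.

```shell
#!/bin/bash
# Hypothetical paths (assumptions); adjust for the real setup.
SCRIPT="${SCRIPT:-./system_traps.sh}"
PIDFILE="${PIDFILE:-/tmp/system_traps.pid}"

is_running() {
    # True if the PID file is readable and that PID is still alive.
    local pid
    [ -r "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

start_script() {
    # Start the script in the background and remember its PID.
    nohup "$SCRIPT" >> nohup.out 2>&1 &
    echo $! > "$PIDFILE"
}

watch() {
    # Check once per second; restart the script if it has gone away.
    while :; do
        is_running || start_script
        sleep 1
    done
}

# Only loop when asked to, so the file can also be sourced safely.
if [ "${1:-}" = "--watch" ]; then
    watch
fi
```

Saved as, say, watchdog.sh (a made-up name), it would be started with: nohup ./watchdog.sh --watch &. Bear in mind that kill -0 only tells you a process with that PID exists, so a reused PID could fool it, and a restart still loses whatever the script missed while it was down.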

But as jim says, you're not solving the underlying problem.

To support Jim's point - a couple of quotes....

Hi Jim,

As you suggested, I tried to find out why the script running in the background exits. This time the script ran for two continuous days and then suddenly stopped. I checked nohup.out for the cause but found nothing. Could you please tell me how I can find out the exact cause? Please find below the code I am using in the script.

#!/bin/bash
o=$IFS
IFS=$(echo -en "\n\b")
#date1=`date +'%Y-%m-%d'`
tail -n -0 -F /home/ketan/logs/KETAN-COMMON-ERROR.log | while read myline; do
        ERROR_CODE=$(echo $myline | awk -F ' ' '{for (i=1;i<=NF;i++){if($i ~ "^[1-1]" && length($i)==6){print $i}}}')
        ERROR_CODE_DET=$(echo $myline | awk -F ' ' '{print $1" "$2}')
        if [ -n "$ERROR_CODE" ] ; then
                echo $ERROR_CODE_DET
                while IFS='|' read -r code disc; do
                        if [ "$ERROR_CODE" -eq "$code" ] ; then
                                echo "sudo snmptrap is triggered with ERROR CODE : $code and ERROR MSG : $disc"
                        fi
                done < ERROR_CODES.txt
        fi
done
IFS=$o

ERROR_CODES.txt contains a list of error codes and descriptions separated by "|", for example:
120011|Route failed.

Regards,
ketanr

Looks like your tail -F command stops after two days. As you are using bash, you may want to examine the PIPESTATUS array in the script to see what went wrong, or check what happened to the ...ERROR.log file
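As a rough illustration of the PIPESTATUS idea: bash fills that array with the exit status of every command in the most recent pipeline. In the sketch below, false is only a stand-in for a tail -F that has died, and the status variable name is made up for the example.

```shell
#!/bin/bash
# Stand-in pipeline: 'false' plays the role of a producer (tail -F) that failed.
false | while read -r line; do
    echo "$line"
done

# Copy the array immediately; the next command overwrites PIPESTATUS.
status=("${PIPESTATUS[@]}")

# status[0] is the producer's exit status, status[1] is the while loop's.
echo "producer exit: ${status[0]}, loop exit: ${status[1]}"
```

In the real script the same two lines would go right after the final done, ideally written to a log file with a timestamp. A status above 128 usually means the command was killed by a signal (128 plus the signal number, so 143 would be SIGTERM).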

Hi RudiC,

As suggested, I checked and found that the input ...ERROR.log file is only appended to when logs for a specific process are generated; otherwise it stays idle for long periods. Is this the reason my script exits, because it has not received any input from the ERROR.log file for a long time? If so, could you please explain how long a Linux system keeps an idle background process running before terminating it when that time expires? Please help.