Check/get the exit status of a remote command executed on a remote host through a script

Geeks,

Could you please help me with my script and identify the missing piece? I need to get the exit status of a command executed on a remote host through a script, and send out an email when one or more processes are not running on any of the servers.

Here's the complete requirement.

From one server I need to ssh to all servers, check whether the desired processes are running, and send out a consolidated email listing all servers that are not running them.

In some cases a server needs to be checked for 3-5 processes; these checks should all be in the same script if possible, otherwise in a different script.

This one works as expected when I run it against a single host from the command line.

ssh -tq <ServerOne> "/bin/ps -ef | /bin/grep  'com.utv.wlrs.mer.mer pop start' " > /dev/null

I need this

for i in $Server_List
do
        output=`ssh $Server_List "/bin/ps -ef | /bin/grep -v grep | /bin/grep 'com.utv.wlrs.mer.mer pop start' > /dev/null 2>&1 | wc -l`
        echo "$output"
        if [ "${output}" != "1" ] ; then
                echo "Process is NOT running on $i" >> /var/tmp/failed.txt
                echo "                           " >> /var/tmp/failed.txt
        fi
done
mailx -s "Mer Failures on `date +%F`" $EMAIL_LIST < /var/tmp/failed.txt

Thanks,
Saikrishna

First, note that this script adds notes about failures to the end of existing text in /var/tmp/failed.txt . So, once an error is detected, future invocations of this script will add new failures to the end of the existing list instead of starting with a clean slate each time you run the script. Why output anything at all if the server is running the command you're looking for? Why print a blank line?

Second, you should ssh to a single server (presumably $i ) instead of to your entire list of servers ( $Server_List ) in your loop. Assuming $Server_List expands to more than one word, ssh will treat the first word as the host and the remaining words as part of the command to run there, so your current command will always fail.

Third, you always send mail even if no errors were detected.

Fourth, if you are trying to count the number of lines in ps output that match a certain string, why are you throwing away the output before counting the number of lines found?

Fifth, on many systems the output from wc -l contains some leading spaces (unless the number of lines counted is 10 million or larger); on those systems a string comparison between the output from wc -l and "1" can never match.

And, finally, if the command:

ssh -tq <ServerOne> "/bin/ps -ef | /bin/grep  'com.utv.wlrs.mer.mer pop start' " > /dev/null

works to check the status of a single server, why are you using a different command to check the status of each server in your loop? What is it about the command above that tells you whether or not the command you're looking for is running on that server? Why can't your script determine whether or not the command you're looking for is running on any server when you run it inside your for loop?
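To pull the points above together, here is a minimal sketch of one way the loop could look. It relies on ssh returning the exit status of the remote command (or 255 if ssh itself fails), so no counting is needed. The function name check_servers and the FAILED_FILE variable are made up for illustration, and $EMAIL_LIST is assumed to be set as in the original script; treat this as a starting point, not a tested solution for your environment.

```shell
#!/bin/sh
# Sketch only. check_servers and FAILED_FILE are hypothetical names;
# EMAIL_LIST is assumed to be set by the caller, as in the original script.
FAILED_FILE=${FAILED_FILE:-/var/tmp/failed.txt}

check_servers() {
    : > "$FAILED_FILE"    # truncate so each run starts with a clean report
    for host in "$@"; do
        # [c]om... stops grep from matching its own entry in the ps listing.
        # -n keeps ssh from reading the script's standard input.
        ssh -n -q "$host" \
            "/bin/ps -ef | /bin/grep '[c]om.utv.wlrs.mer.mer pop start'" \
            > /dev/null 2>&1
        case $? in
            0)   ;;  # grep found the process: nothing to report
            255) echo "Could not connect to $host" >> "$FAILED_FILE" ;;
            *)   echo "Process is NOT running on $host" >> "$FAILED_FILE" ;;
        esac
    done
    # Send mail only when at least one failure was recorded.
    if [ -s "$FAILED_FILE" ]; then
        mailx -s "Mer Failures on $(date +%F)" $EMAIL_LIST < "$FAILED_FILE"
    fi
}

# Usage (word splitting of the list is intended here):
# check_servers $Server_List
```

Checking several processes per server would mean repeating the ssh test with a different pattern for each one; how you map servers to patterns depends on how your lists are stored.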

If the output of grep is redirected to /dev/null it never gets piped to wc -l
There's also a mismatched "
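
A small local demonstration of the redirection problem, as a sketch: the ps_output variable is invented sample text standing in for real ps -ef output, so this runs without ssh.

```shell
# Hypothetical stand-in for `ps -ef` output so the example runs anywhere.
ps_output='user 101 1 java com.utv.wlrs.mer.mer pop start
user 102 1 sshd'

# grep's stdout goes to /dev/null, so wc -l reads an empty pipe and counts 0.
discarded=$(printf '%s\n' "$ps_output" | grep 'mer pop start' > /dev/null 2>&1 | wc -l)

# Without the redirection, the matching line reaches wc -l.
counted=$(printf '%s\n' "$ps_output" | grep 'mer pop start' | wc -l)

# Arithmetic expansion drops any padding wc may add around the number.
echo "discarded=$((discarded)) counted=$((counted))"
```

With the sample text above this prints discarded=0 counted=1.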

Hello Don,

We only need to report failures; if the process is running as expected, there is no need to report anything.

Every failure across the server list should be appended to the file /var/tmp/failed.txt so that we can send them all out in 1 email.

The servers differ, so the processes running on them differ. I can append wc -l at the end of the script for the string comparison if that addresses the issue.

Due to the large number of servers it's hard to manage/install a script on every one of them. We need one host to check for errors and report the exact failure by email, so that if new servers are added to the list along with new processes we can just edit/update the script in 1 place rather than 1000+.

I will update the script to send email only in case of failures, i.e. only when /var/tmp/failed.txt is found.

Thanks,
Saikrishna

Hello Don/All,

I fixed the issue with a while loop and awk.


while read line ; do awk '{if ($2 ~ /^1$/) print $0}' ; done < default

Thanks,
Saikrishna

It has been quite a few days since you introduced your post and I do not remember all the details of it, but that could have just been done as:

awk '$2==1' default

alone.
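
To illustrate, here is a short sketch using invented sample data (a hostname in column 1 and a count in column 2; the real layout of the file named default is not shown in the thread, so this is an assumption).

```shell
# Hypothetical sample data: hostname in column 1, process count in column 2.
sample=$(mktemp)
printf '%s\n' 'serverA 1' 'serverB 0' 'serverC 1' > "$sample"

# awk reads every line by itself; a pattern with no action prints the matches.
direct=$(awk '$2 == 1' "$sample")
echo "$direct"

# In the while/read version, read consumes the first line before awk starts,
# so awk only sees the lines after it.
wrapped=$(while read line; do awk '{if ($2 ~ /^1$/) print $0}'; done < "$sample")
echo "$wrapped"

rm -f "$sample"
```

The first command prints the serverA and serverC lines. One difference worth knowing: $2==1 compares numerically, so a field such as 01 would also match, while the regular expression /^1$/ matches only the exact string 1.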