Need some help in bash scripting with ssh

Hi @ all

I have the following scenario:

As admin of a couple of servers, I tried to write the following script to find out whether each machine is up and reachable and whether certain directories exist. But the script has some problems while running. Maybe some of you know a better way to solve this:

#!/bin/bash
while read server
do
if [ ! -z $server ]
then
        ping -c 1 $server
        if [ $? -eq 0 ] ; then
                echo -e "Das System "$server" ist per Ping erreichbar und wird jetzt abgefragt!"
                echo -e "Das System "$server" ist erreichbar und steht zur Verfuegung" >> erreichbare-server-log.txt
                WLS=$(ssh -f -x -n $server 'ls -d /opt/wls*[0-9] 2>/dev/null|wc -l')
                OHS=$(ssh -f -x -n $server 'ls /opt/*/*/Apache/Apache/logs/httpd.pid 2>/dev/null|wc -l')
                WAS=$(ssh -f -x -n $server 'ls -d /opt/IBM/WebS*[A-Za-z0-9] 2>/dev/null|wc -l')
                IHS=$(ssh -f -x -n $server 'ps -ef | grep -i HTTPServer | grep root | grep start |grep -v grep 2>/dev/null|wc -l')
                Apache=$(ssh -f -x -n $server 'ls -d /usr/local/apache/bin/apache*[A-Za-z0-9] 2>/dev/null|wc -l')
                JBoss=$(ssh -f -x -n $server 'ps -ef |grep -i org.jboss.Main 2>/dev/null|grep -v grep|wc -l')
                TomCat=$(ssh -f -x -n $server 'ps -ef |grep -i catalina.home 2>/dev/null|grep -v grep|wc -l')
                MySQL=$(ssh -f -x -n $server 'ps -ef |grep -i mysql 2>/dev/null|grep -v grep|wc -l')
                echo -e "$server; \t\t $WLS; \t $OHS; \t $WAS; \t $IHS; \t\t $Apache; \t \t $JBoss; \t \t $TomCat; \t \t $MySQL" >> servers2.csv
        else
                echo "Maschine ist nicht erreichbar"
                echo -e "Das System "$server" ist aktuell nicht erreichbar!" >> nicht-erreichbare-server-log.txt
                set -e

        fi
else
        echo "IP Address is empty"
fi
done < serverip.txt
cat servers2.csv | awk -F';''BEGIN(WLS=0;OHS=0; WAS=0; IHS=0; Apache=0; JBoss=0; TomCat=0; MySQL=0}{WLS+=$2;OHS+=$3;WAS+=$4;IHS+=$5;Apache+=$6;JBoss+=$7;TomCat+=$8;MySQL+=$9}END{printf("Gesamtanzahl; \t \t %d; \t %d; \t %d; \t %d; \t\t %d; \t\t %d; \t \t %d; \t \t %d\n", WLS, OHS, WAS, IHS, Apache, JBoss, TomCat, MySQL)}' >> servers2.csv

echo -e "Dieses Script wurde ausgefuehrt: " DATE=`/bin/date +%d-%m-%y_time_%H-%M-%S` >> servers2.csv
Time() >> servers2.csv

echo -e "FERTIG"

The list of servers is stored in a separate file, from which I read the IPs. Any suggestions are welcome.

What are the problems?

While the while loop is running, something goes wrong and the workstation hangs. I don't know why or exactly where the error occurs. Run separately, every command works fine. Maybe something is wrong with the quoting? But I'm not sure...

When I run the script, I get a freeze at the following point:

./ipaddr.sh
PING 10.2.20.114 (10.2.20.114) 56(84) bytes of data.
64 bytes from 10.2.20.114: icmp_seq=1 ttl=64 time=1.34 ms

--- 10.2.20.114 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.347/1.347/1.347/0.000 ms
Machine is giving ping response
Das System 10.2.20.114 ist per Ping erreichbar und wird jetzt abgefragt!
PING 10.253.8.250 56(84) bytes of data.
64 bytes from 10.253.8.250: icmp_seq=1 ttl=53 time=6.63 ms

--- 10.2.20.116 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 6.635/6.635/6.635/0.000 ms
Machine is giving ping response
Das System 10.2.20.116 ist per Ping erreichbar und wird jetzt abgefragt!


After 30 seconds the script halts on the following error:

ssh: connect to host 10.2.20.116 port 22: Connection timed out

But ssh to this host is definitely possible and works from bash. So there must be an error somewhere that I just don't see!

Any one of the commands containing asterisks could expand to a line which is too long for the remote system's shell or produce some invalid syntax. Maybe try a diagnostic echo of each command to see whether one or more of them is failing on that particular remote system.

As a minimum, try placing a diagnostic echo after each ssh command in the master script so you can work out which command is failing on that particular server.
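
A minimal sketch of such a diagnostic echo, assuming a hypothetical helper called run_check (it is not part of the original script). It runs one command, passes its output through, and reports the exit status on stderr so the captured value stays clean. ssh returns 255 when the connection itself fails and otherwise the exit status of the remote command. The two demo calls use local stand-ins so the sketch runs without a network.

```shell
#!/bin/bash
# run_check LABEL COMMAND...: run a command, print its output, and write a
# diagnostic line with the exit status to stderr.
# (run_check is a hypothetical helper name used only for this sketch.)
run_check() {
    local label=$1
    shift
    local out status=0
    out=$("$@") || status=$?
    echo "DEBUG: $label exit status=$status" >&2
    printf '%s\n' "$out"
    return "$status"
}

# In the real loop a call could look like this (needs a reachable host):
#   WLS=$(run_check WLS ssh -x -n "$server" 'ls -d /opt/wls*[0-9] 2>/dev/null | wc -l')

# Local stand-ins for a working and a failing ssh call:
ok=$(run_check demo-ok sh -c 'echo 3')
bad_status=0
bad=$(run_check demo-fail sh -c 'exit 255') || bad_status=$?
echo "ok=$ok bad_status=$bad_status"
```

The DEBUG lines show directly which check fails on which server, without changing what ends up in the variables.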

Are all the remote servers running the same Operating System as the master server?

Personally I would not issue one ssh per command. I would write a proper robust script and place it on each remote system.

Thanks for fixing my mistake. I hadn't seen the code tag earlier... Sorry for that.
I think I found one error, but it still doesn't work as expected. When I put an echo after every ssh line and tested it, I found that some ssh connections are blocked for my user, because I am not allowed to log in to every machine. OK, that was my fault. Now the ping works fine and I get a nice reply back from the machines. The output into the files is fine too. But the script doesn't jump into the else part if a ping or an ssh connection isn't possible. Do I have to change this part? Any suggestions?

And placing the script locally on the machines isn't that easy, because there are over 1700 machines, most of them virtualized and maintained by our customers. So I don't have the root access I would need for the kind of operations I'm planning.

And not all machines are running the same OS. There are different distributions on different machines, depending on customer needs.

Thanks in advance for your help! Sometimes I can't see the forest for the trees.

check this

...
cat servers2.csv | awk -F';''BEGIN(WLS=0;OHS=0; WAS=0; IHS=0; Apache=0; JBoss=0; TomCat=0; MySQL=0}{.......}'

Try changing it to:

awk -F';' 'BEGIN{....
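
Written out in full, the corrected call might look like the sketch below. Two things change: there is a space between -F';' and the program, so awk receives them as separate arguments, and BEGIN is followed by { instead of (. The sample rows are invented for the demo and only stand in for servers2.csv.

```shell
#!/bin/bash
# Sample data standing in for servers2.csv (values are invented for the demo).
csv=$(mktemp)
printf '%s\n' \
    '10.0.0.1; 1; 0; 2; 0; 1; 0; 1; 3' \
    '10.0.0.2; 0; 1; 0; 1; 0; 2; 0; 1' > "$csv"

# Sum columns 2 to 9 and print one totals line.
# awk variables start at 0 anyway, so the BEGIN block is optional.
totals=$(awk -F';' '
    BEGIN { WLS = 0; OHS = 0; WAS = 0; IHS = 0; Apache = 0; JBoss = 0; TomCat = 0; MySQL = 0 }
    { WLS += $2; OHS += $3; WAS += $4; IHS += $5
      Apache += $6; JBoss += $7; TomCat += $8; MySQL += $9 }
    END { printf "Gesamtanzahl; %d; %d; %d; %d; %d; %d; %d; %d\n",
          WLS, OHS, WAS, IHS, Apache, JBoss, TomCat, MySQL }
' "$csv")
echo "$totals"
rm -f "$csv"
```

Passing the file name to awk directly also makes the cat unnecessary.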

Thanks for the help, guys, but it's still not working. I'm running into a wall with this one, however I try to modify it.

I think the problem is in the ssh requests. How can I leave the normal path of the while loop when an error occurs? Do I need to insert another if ... then query, or is a second while loop better, so that an error in the connection or during the transfer of commands makes the script jump out of the first while loop?
A colleague of mine told me to try to simplify the ssh requests and merge them together. But I'm not sure whether this will solve my problems. What do you think?

Another error I'm getting: if a machine is up and pingable, but the ping returns a time-to-live error, is it possible to set a filter that separates these machines without awk and sed?
Here is what I mean:

PING 172.16.15.10 (172.16.15.10) 56(84) bytes of data.
From 10.128.16.155: icmp_seq=1 Time to live exceeded

My script runs until this message and then stops without an error message. The rest of the IPs are not processed. How could I handle this in my script? Any proposals?

Remove the line set -e. I'm not sure what you wanted it to do, but it changes the way the shell works so that the script ends as soon as a command returns a non-zero exit status.
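
A small demonstration of that effect, with false standing in for a failed ping or ssh. Each variant runs in its own bash process so the demo itself keeps going. In the original script set -e sits in the else branch, so it presumably only takes effect after the first unreachable host; from then on, the next bare ping that returns non-zero would end the script silently.

```shell
#!/bin/bash
# 'false' stands in for a command such as ping returning non-zero.
# With set -e the child shell stops at 'false' and never reaches echo.
with_e=$(bash -c 'set -e; false; echo "still running"') || true

# Without set -e the failure is ignored and the script carries on.
without_e=$(bash -c 'false; echo "still running"')

echo "with set -e:    '${with_e}'"
echo "without set -e: '${without_e}'"
```

Testing the command directly, as in if ! ping -c 1 "$server", keeps a failure from ending the script even when set -e is active, because commands used as an if condition are exempt.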

You can remove the overhead of many ssh sessions by using a here-doc to send the script to the remote shell.

while read server; do
        if [[ -z $server ]]; then
                echo "IP Address is empty"
                continue
        fi
        if ! ping -c 1 "$server"; then
                echo "Maschine ist nicht erreichbar"
                echo "Das System $server ist aktuell nicht erreichbar!" >> nicht-erreichbare-server-log.txt
                continue
        fi
        echo "Das System $server ist per Ping erreichbar und wird jetzt abgefragt!"
        echo "Das System $server ist erreichbar und steht zur Verfuegung" >> erreichbare-server-log.txt
        ssh -x "$server" /bin/bash -s "$server" << 'EOF'
WLS=$(ls -d /opt/wls*[0-9] 2>/dev/null|wc -l)
OHS=$(ls /opt/*/*/Apache/Apache/logs/httpd.pid 2>/dev/null|wc -l)
WAS=$(ls -d /opt/IBM/WebS*[A-Za-z0-9] 2>/dev/null|wc -l)
IHS=$(ps -ef | grep -i '[H]TTPServer' | grep root | grep start 2>/dev/null|wc -l)
Apache=$(ls -d /usr/local/apache/bin/apache*[A-Za-z0-9] 2>/dev/null|wc -l)
JBoss=$(ps -ef |grep -i '[o]rg.jboss.Main' 2>/dev/null|wc -l)
TomCat=$(ps -ef |grep -i '[c]atalina.home' 2>/dev/null|wc -l)
MySQL=$(ps -ef |grep -i '[m]ysql' 2>/dev/null|wc -l)
echo -e "$1; \t\t $WLS; \t $OHS; \t $WAS; \t $IHS; \t\t $Apache; \t \t $JBoss; \t \t $TomCat; \t \t $MySQL"
EOF
done < serverip.txt
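
For the earlier question about skipping a host when ssh itself fails, one option is to look at the exit status of the ssh call. ssh returns 255 for its own errors (timeout, refused connection, failed authentication) and otherwise the exit status of the remote command. The sketch below assumes a hypothetical helper describe_ssh_status and a hypothetical log file name; BatchMode and ConnectTimeout are standard OpenSSH options that stop ssh from waiting for a password prompt or hanging on a dead host.

```shell
#!/bin/bash
# describe_ssh_status is a hypothetical helper: it maps the exit status of an
# ssh call to a short label. A remote command that itself exits with 255
# cannot be told apart from an ssh error this way.
describe_ssh_status() {
    case $1 in
        0)   echo "ok" ;;
        255) echo "ssh-failed" ;;
        *)   echo "remote-command-failed" ;;
    esac
}

# Intended use inside the loop (not executed here, it needs real hosts):
#   ssh -x -o BatchMode=yes -o ConnectTimeout=10 "$server" /bin/bash -s "$server" << 'EOF'
#   ...
#   EOF
#   status=$?
#   if [ "$(describe_ssh_status "$status")" = "ssh-failed" ]; then
#       echo "$server" >> ssh-failed-log.txt   # hypothetical log file name
#       continue
#   fi

# Show the mapping for the three cases:
for code in 0 255 1; do
    echo "$code -> $(describe_ssh_status "$code")"
done
```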

Really great work. I changed a few small things and optimized the script a bit, but the main part works as expected.
Thank you neutronscott for taking the time!

As I already said, sometimes you can't see the forest for the trees.
I really appreciate your help!