Script to access multiple Linux servers to get system details such as CPU usage

Hi

Is there a shell script that accesses multiple Linux servers to get details such as CPU usage, RAM used, etc.? The servers must be accessed in parallel, not serially; that is, the script must ping all the servers at the same time to get the information. The script has to be triggered from a host system and collect the information there. It must also be scalable, in the sense that servers can be added or removed easily and dynamically without changing much of the script.

Thanks in advance.

What have you tried so far?

1) You can set up a config file that holds the list of servers you want to scan (that way you can easily filter out commented lines, so the commented-out servers are not scanned):

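# Skip comments and blank lines, then run the given command on every host in parallel.
grep -Ev '^(#|$)' yourlist.conf | {
    while read -r a
    do
        ssh -n "$a" "$@" &    # -n stops ssh from swallowing the rest of the host list on stdin
    done
    wait                      # block until every background ssh has finished
}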

or something like this

2) The second option could be to display the list of servers and prompt for the ones you want to scan:

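LIST="serv1 serv2 serv3 serv4 serv5 serv6 serv7 serv8 serv9 serv0"
# printf handles the newline portably; echo "\n" is not interpreted by every shell.
printf '%s\nWhich servers do you want to scan? ' "$LIST"
read -r ans
for i in $ans                 # unquoted on purpose: split the answer into host names
do
    ssh -n "$i" "$@" &
done
wait                          # block until every background ssh has finished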

These short pieces of code need, of course, to be enhanced to behave as expected; I just provided them to illustrate my thinking.
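As one possible enhancement of option 1, here is a rough sketch that caps the number of simultaneous ssh sessions and keeps each host's output in its own file. The function names, the output layout, and the remote command are my own assumptions, not something from a tested setup; it also assumes key-based ssh logins, Linux targets with /proc, and an xargs that supports -P (GNU and BSD do, but it is not POSIX).

```shell
#!/bin/sh
# Sketch only: function names, file layout and remote command are illustrative assumptions.

# Command run on each host (assumes Linux; MemAvailable needs kernel 3.14 or later).
REMOTE_CMD='cat /proc/loadavg; grep -E "^(MemTotal|MemAvailable):" /proc/meminfo'

# list_hosts HOSTFILE
# Print the active host names, skipping blank lines and '#' comments.
list_hosts() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# poll_hosts HOSTFILE MAX_JOBS OUTDIR
# Run REMOTE_CMD on every listed host, at most MAX_JOBS at a time,
# and store each host's output (stdout and stderr) in OUTDIR/<host>.out.
poll_hosts() {
    mkdir -p "$3" || return 1
    list_hosts "$1" | xargs -P "$2" -I{} sh -c \
        'ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$1" "$2" > "$3/$1.out" 2>&1' \
        sh {} "$REMOTE_CMD" "$3"
}
```

You would call it as something like poll_hosts yourlist.conf 50 ./results. BatchMode makes ssh fail instead of prompting for a password, and ConnectTimeout keeps one dead host from holding up a slot for long.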

Thanks for your reply.

I have searched for such a script and found this:
http://bash.cyberciti.biz/monitoring/get-system-information-in-html-format/

But I have not verified it. The number of servers involved is around 700, and it can vary frequently. The problem is that I want to ping all the servers in parallel, at the same time, not one after another. The total latency for collecting information from all 700 servers should be about 1 second.

Thanks in advance.

What you need is possible.

See here:

http://www.unix.com/unix-advanced-expert-users/151905-command-run-across-servers.html

I wrote a wrapper for SSH that will run commands on other servers in parallel. It accepts a file, which contains host names.

I don't know about pinging 700 servers in under a second, but in theory it should work. Come to think of it, though, you wouldn't really be pinging the remote hosts; you would be running commands on them.

Anyway, here is an idea:

root@ms:/>gdsh
   Usage: [[-s working_col_file|OS_name] | -w nodename{,nodename}] -c command

   Required Parameters:
         -s or -w and -c

   Optional Parameters:
         [-n]
         [-d] [-h] [-n] [-T] [-v]

   Where:
         -C
            Collapses output. Used with dshbak.
         -c specifies the command to run
            Required.
         -n
            No Parallel mode. By default the command is run on every node at the same time.
         -s
            Specify a working collective file OR specify an OS name. Use uname -s to specify
             a working collective file with that postfix. ie. /.gwcoll.AIX.
         -w nodename{,nodename}
            Default: ALL nodes

   Options that affect only how the script runs:
         -d
            Enable debug mode. Variable contents will be printed. May be specified more than once.
         -h
            Prints this help screen.
         -q
            quiet. Limits output to the essentials.
         -T
            Testing mode. NO actual work will be done.
         -v
            Verbose mode. May be specified more than once.

   Notes:
         One of -s or -w is required.
         Piping the output to dshbak -c will group like results together.
         Use -C with dshbak, or you will get no output.
         -c must be the last flag on the command line.

   Example execution statements:
         gdsh -s /.gwcoll.AIX -c date
         gdsh -s AIX -c date
         gdsh -C -s AIX -c date | dshbak -c
         gdsh -s all.Linux_HP-UX -c date
         gdsh -C -w unxn_sw,wpgux005 -c "date" | dshbak -c

Example run:

gdsh -C -s HP-UX -c "/usr/bin/date" | dshbak -c

HOSTS -------------------------------------------------------------------------
wpgux001_sw, wpgux003_sw, wpgux004_sw, wpgux005_sw, wpgux006_sw, wpgux007_sw, wpgux010_sw, wpgux011_sw
-------------------------------------------------------------------------------
Thu Jan 20 08:29:18 CST 2011


HOSTS -------------------------------------------------------------------------
wpgux002_sw
-------------------------------------------------------------------------------
Thu Jan 20 08:29:19 CST 2011
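If you don't have dshbak available, the grouping step can be approximated with awk. This is only a sketch of the idea, not how dshbak itself is implemented: the function name is made up, and it assumes every input line has the form "host: text" with exactly one line per host.

```shell
#!/bin/sh
# Sketch only: assumes input lines look like "host: text", one line per host.

# group_by_output
# Read "host: text" lines on stdin and print each distinct text once,
# preceded by the comma-separated list of hosts that produced it.
group_by_output() {
    awk '
    {
        host = $0; sub(/:.*/, "", host)             # part before the first colon
        text = $0; sub(/^[^:]*:[ \t]*/, "", text)   # part after "host:" and blanks
        if (text in hosts) {
            hosts[text] = hosts[text] ", " host
        } else {
            order[++n] = text                       # remember first-seen order
            hosts[text] = host
        }
    }
    END {
        for (i = 1; i <= n; i++) {
            print "HOSTS: " hosts[order[i]]
            print order[i]
            print ""
        }
    }'
}
```

Any parallel runner that prefixes each output line with the host name could be piped into group_by_output.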

Thanks a lot for your valuable reply.

I know that 700 servers in under 1 second is highly impractical, but I wanted very low latency. Anyway, I'll try your solution.

Thanks again :)

ssh doesn't scale in the sense that, if you want to connect to 700 servers at the same time, you have to run 700 separate instances of it.

Have you considered a push instead of a pull system? Put a script on each server that contacts your host, and have cron run it periodically. The access method could be as simple as a POST request to a CGI script on your machine that stores the data locally, which would naturally scale quite well.
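A rough sketch of what the script on each server might look like, assuming Linux with /proc and curl installed. The collector URL, the field names, and the function names are all hypothetical; the CGI script on the receiving side is not shown.

```shell
#!/bin/sh
# Push-style reporter sketch. COLLECTOR_URL and the field names are hypothetical.
COLLECTOR_URL=${COLLECTOR_URL:-http://monitor.example.com/cgi-bin/collect.cgi}

# build_report
# Print one "key=value&..." line with the 1-minute load average and
# memory figures in kB, read from /proc (Linux only).
build_report() {
    load=$(cut -d' ' -f1 /proc/loadavg)
    mem_total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    mem_free=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
    printf 'host=%s&load1=%s&mem_total_kb=%s&mem_free_kb=%s\n' \
        "$(uname -n)" "$load" "$mem_total" "$mem_free"
}

# send_report
# POST the report to the collector; give up after 5 seconds so that
# slow runs started by cron do not pile up.
send_report() {
    build_report | curl -fsS --max-time 5 --data @- "$COLLECTOR_URL" >/dev/null
}
```

To use it, you would add a final send_report line to the script and have cron call it, for example once a minute with a crontab entry like: * * * * * /usr/local/bin/push-report.sh (the path is just an example). The host then only has to read what the collector stored, no matter how many servers report in.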

How about using something like Nagios?

Please deploy one of the great NMS/EMS tools available, such as NetXMS, Cacti, OpenNMS, or Nagios; I have listed them in my order of preference.
You can also opt for a good, economical commercial product such as OpManager from ManageEngine.