Performance Monitoring script for UNIX servers

ssk250 · March 19, 2014, 4:34am

Hi,

I have been working on an automated script that will run 24x7 to monitor performance parameters such as CPU, memory, disk I/O, network and swap space on all types of Unix servers (HP-UX, AIX, Solaris, Linux).

The problem is that I am confused by the commands top, prstat, vmstat, free, sar, etc.

Can anyone help me understand which command is best for getting CPU and memory information, either one that works on all of them or one for each system (Solaris, AIX, Linux, HP-UX)?

Also, please share any scripts you have for monitoring the performance of Unix servers, and mention any other parameters that should be monitored to judge the performance of an application or database server.

Thanks in advance....

Regards
SSK250 :)

jim_mcnamara · March 19, 2014, 7:27am

As a generalization, most basic commands will work on most UNIX flavors. However, POSIX has little to say about what system and system management commands look like or how they behave.

Try what I think is the original UNIX Rosetta stone:

Rosetta Stone for Unix

After that you may want to reduce the scope of your project. You will also need a running instance of each OS to unit test against.

rbatte1 · March 19, 2014, 7:54am

Make sure that you read the manual pages on each system even if the command names are the same. You may get slight variations in output column names, positions etc., so you might need to create a flexible script that you can adjust easily, e.g. if you pipe a command's output into something like this:

...... | while read col1 col2 col3 col4 col5 rest
do
   echo "The val I want is $col5"
done

.... then you need to be sure that the value you want really is in column 5 on every system.
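
One way to make the column choice less fragile is to look the column up by its header name instead of hard-coding its position. The following is only a minimal sketch: col_by_name is a hypothetical helper name, and it assumes the command prints a header line whose labels line up one-to-one with the data fields (as vmstat usually does). On Solaris you would probably need nawk instead of awk, because the old awk there does not accept -v.

```shell
# Hypothetical helper: print the value found under a named header column.
# Usage (assumption: the header contains the label you ask for):
#   vmstat 5 2 | col_by_name id
col_by_name() {
  awk -v want="$1" '
    # Until the header is found, search each line for the wanted label.
    col == 0 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
    # Skip headers that are printed again in long-running output.
    $col == want { next }
    # Every other line is data: print the field under the wanted header.
    { print $col }
  '
}
```

If the label is never found, the function prints nothing, so the calling script should treat empty output as an error rather than as zero.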

I hope that this helps you avoid some surprises.

Robin

Chubler_XL · March 19, 2014, 11:31pm

Have you considered using an SNMP daemon and checking for alert conditions with that?

There are already quite a few status monitoring tools around, both commercial and open source (Nagios/SolarWinds/Cacti come to mind), that can send SMS/email alerts whenever certain conditions arise, monitor and graph trends, and the like.

MadeInGermany · March 21, 2014, 9:26am

Most people monitor RAM and swap separately, but the resulting figures are mostly of academic interest, and choosing alarm thresholds for them is problematic.
More relevant is the sum of them, virtual memory (let's call it VMEM).
Here is a quick-and-dirty implementation:

case `uname` in
Linux)
  # deduct some cached data because it is easily reclaimable
  # (assumes the older free layout where $6 is buffers and $7 is cached)
  used=`free | awk '
/^[Mm]em/ {used+=$3; eused+=$3-($6+$7)/f; free+=$4}
/^[Ss]wap/ {used+=$3; eused+=$3; free+=$4}
END {print int(eused*100/(used+free))}
' f=2`
  ;;
SunOS)
  used=`swap -s | nawk '{print int($9*100/($11+$9))}'`
  ;;
esac
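
To turn the percentage in $used into an alert, something along these lines might do. This is only a sketch: vmem_status is a hypothetical helper name, the limits 80 and 90 are purely illustrative defaults rather than recommendations, and the exit codes follow the convention used by Nagios plugins (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
# Hypothetical helper: classify a usage percentage against two limits.
# Usage: vmem_status "$used" [warn] [crit]
vmem_status() {
  pct=$1
  warn=${2:-80}   # illustrative default, adjust to your environment
  crit=${3:-90}   # illustrative default, adjust to your environment
  case $pct in
    # Empty or non-numeric input, e.g. on an OS the case block does not handle.
    ''|*[!0-9]*) echo "UNKNOWN"; return 3 ;;
  esac
  if [ "$pct" -ge "$crit" ]; then
    echo "CRITICAL"; return 2
  elif [ "$pct" -ge "$warn" ]; then
    echo "WARNING"; return 1
  else
    echo "OK"; return 0
  fi
}
```

Called as vmem_status "$used" right after the esac, it prints one word and sets the exit status, so a wrapper or cron job can decide whether to send a mail.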