performance script

jlk · May 1, 2009, 7:37am

Hi guys, I am trying to create a script that will email me when RAM or CPU usage gets above a certain level.

I am trying to use grep and awk on the output of top, but the list keeps changing. Could anyone give me any pointers on what is best to use for this kind of thing?

Thanks

Jon

funksen · May 1, 2009, 8:19am

use vmstat

vmstat 900 2 | tail -1

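The last line is the mean over the 15-minute (900-second) interval; the first report line is the average since boot.

Process it with read, e.g.: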

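# "while read" keeps the variables in scope in bash as well as ksh;
# a bare "| read" at the end of a pipeline loses them in bash.
vmstat 900 2 | tail -1 | while read r b swpd free buff cache si so ....
do
    if [[ $id -lt value || $free -lt value || $..... ]]
    then
        echo "cpuid $id, memfree $free" | mail -s subject emailaddress
    fi
done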

for example.

Run it from cron, or leave it running continuously with:

vmstat 900 | while read ....
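
A minimal sketch of the check above, filled in so it can be run. It assumes the column order of Linux procps vmstat (r b swpd free buff cache si so bi bo in cs us sy id ...); other systems order the columns differently, so check your own header line first. The function name check_vmstat_line and the two threshold values are made up for illustration.

```shell
#!/bin/sh
# Hypothetical thresholds; tune them for your own machine.
MIN_IDLE=10      # alert when CPU idle % drops below this
MIN_FREE=50000   # alert when free memory (KB) drops below this

# check_vmstat_line LINE
# Takes one data line of vmstat output and prints an alert message if
# either threshold is crossed; prints nothing otherwise.
# Assumes Linux procps column order. The variables set by read are global.
check_vmstat_line() {
    # Feeding read from a here-document keeps it in the current shell,
    # so the variables are still set afterwards.
    read -r r b swpd free buff cache si so bi bo intr cs us sy id rest <<EOF
$1
EOF
    if [ "$id" -lt "$MIN_IDLE" ] || [ "$free" -lt "$MIN_FREE" ]; then
        echo "cpuid $id, memfree $free"
    fi
}

# Typical use from cron (commented out so that sourcing this file has no
# side effects; subject and emailaddress are placeholders):
#   msg=$(check_vmstat_line "$(vmstat 900 2 | tail -1)")
#   [ -n "$msg" ] && echo "$msg" | mail -s subject emailaddress
```

Capturing the message first and mailing only when it is non-empty avoids sending blank mails on every run.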

kodak · May 1, 2009, 10:14am

I'd just like to mention that if you have more than one machine that you'd like to monitor, you'd be much better served by installing a monitoring package.

There are lots and lots of them out there, from the very simple munin types to Nagios, both free and commercial.

If it's a single machine then shell scripts are fine for this sort of thing, but even they can get unwieldy if you try to monitor too much. Most monitoring packages will have notification built in.

jlk · May 1, 2009, 11:58am

Thanks for that funksen!
Kodak: I've seen a lot of monitoring packages, but I thought I might give a shell script a chance before I try one. Are there any specific packages you would recommend?

thanks

kodak · May 1, 2009, 4:27pm

It all depends on what you want to do.

Munin is dead simple to set up, but its capabilities by itself are pretty slim. Nagios is very capable, but more complex to configure, and doesn't graph out of the box (there are plugins that do so.)

I wish I could tell you "use this" but I can't. Sorry.

pludi · May 1, 2009, 4:54pm

Munin: simple and easy to set up, but due to how RRDtool works you'll lose accuracy on older measurements. Also, you'll have to hack it if you want anything but the default 5-minute resolution, you'll have to correlate graphs manually, and most plugins are written for Linux (extensive use of /proc and iptables).

Nagios: complex, but very powerful.

Zabbix: (no personal experience, but Neo mentioned it once) also very powerful, and seems to support graphing and reports out of the box.

MarkSeger · May 2, 2009, 8:20am

And then there's collectl.

It can do most anything, from simple interactive monitoring of a few 'subsystems' like CPU, disk, etc. to writing the data to disk and letting you play it back later. For more advanced use you can have it send its data to higher-level monitoring tools like ganglia.

It even has a --vmstat switch which means you can play back old data in vmstat format with timestamps, just one of many options.

The key is collectl is very lightweight, using less than 0.5% CPU when sampling everything every 10 seconds. Most users simply enable it as a daemon and keep it running forever. Then if you have a problem, the data you need has probably already been collected for you.

-mark

kodak · May 5, 2009, 2:19pm

Collectl looks like a neat tool, but it appears to be Linux-centric, which is unfortunate for shops with other types of *nix.

MarkSeger · May 13, 2009, 7:11am

This may be overkill and I'm not even sure if it would work on other *nix systems until someone tries it, but...

You could always write your own data importation modules, though it would be a bunch of work. But then collectl could report it, file it, send it over sockets, etc.

So if anyone is up for trying and wants some help I'd be happy to answer any questions about it, but as I said it's probably a lot of work of dubious value.

-mark

jlk · May 13, 2009, 8:43am

Thank you for all the info so far, this forum is very friendly.
All I really need to do is monitor CPU usage, free memory and disk space. One of my servers ran out of space a few months ago due to a crontab entry that was generating reports, so I just need to monitor the basics for now, until I move on to bigger servers and more apps.
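
For the disk space part, a minimal sketch in the same spirit as the vmstat check earlier in the thread. It assumes the POSIX output format of df -P (capacity in the fifth column, mount point in the sixth) and mount points without spaces. The function name check_df_output and the 90% threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical threshold: alert when a filesystem is more than 90% full.
MAX_USED=90

# check_df_output
# Reads `df -P` output on stdin and prints one line for each filesystem
# whose capacity is above MAX_USED. Prints nothing if all are below it.
check_df_output() {
    awk -v max="$MAX_USED" 'NR > 1 {      # skip the header line
        used = $5
        sub(/%/, "", used)                # "95%" -> "95"
        if (used + 0 > max + 0)
            print $6 " is " $5 " full"
    }'
}

# Typical use from cron (commented out so that sourcing this file has no
# side effects; subject and emailaddress are placeholders):
#   msg=$(df -P | check_df_output)
#   [ -n "$msg" ] && echo "$msg" | mail -s subject emailaddress
```

Using df -P rather than plain df keeps each filesystem on a single line, which makes the awk field numbers reliable.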