Hello, I am trying to write a script that will monitor a few processes (winbind) for CPU utilization. If a process consumes more than, say, 99% CPU for 3 minutes, I want to run a script to restart the service that forks the process.
The logic I am trying to use is to grep for winbind in the ps output.
Then add up the leftmost column and see if the total is greater than 99%. If it's consistently above 99% for 3 or 5 minutes, then do `/etc/init.d/winbind restart`.
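A minimal sketch of that summing step, in case it helps. It assumes the CPU figure is the first column and the command name the second, which is what `ps -e -o pcpu= -o comm=` prints. The function name `sum_cpu` is made up for illustration, and the match is a plain substring match, the same as grep would do. Keep in mind that on Linux procps the %CPU value from ps is averaged over the lifetime of the process, so it reacts slowly to short spikes.

```shell
#!/bin/sh
# sum_cpu NAME: read "pcpu comm" lines on stdin and print the total %CPU
# of every process whose command name contains NAME.
# (Hypothetical helper; reading stdin makes it easy to try on canned input.)
sum_cpu() {
    awk -v name="$1" '
        index($2, name) { total += $1 }      # substring match, like grep
        END { printf "%.1f\n", total }       # prints 0.0 when nothing matched
    '
}

# Live usage (the empty "=" headers suppress the ps header line):
#   ps -e -o pcpu= -o comm= | sum_cpu winbind
```

Matching on the command-name column inside awk also avoids the usual problem of grep finding itself in the ps output.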
Probably the easiest approach would be to take a snapshot of the processes periodically over the 5-minute period and use the accumulated numbers for your calculation. You might also look at experimenting with sar -X|x.
Depending on what version of sar you have, the -x|X switches are pid specific.
For a process monitor, I'd probably want it to run all the time and periodically grab the ps output to determine whether the process is under|over your thresholds: one large loop with a sleep at the end. Every iteration through the loop would take another snapshot of the state of your processes and make comparisons. You could also store the results of the previous few snapshots and calculate averages from them.
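A rough sketch of such a loop, under some assumptions: a snapshot every 30 seconds, and a restart after 6 consecutive snapshots above 99% (about 3 minutes). The names `THRESHOLD`, `INTERVAL`, `LIMIT`, `over_threshold`, `next_streak` and `monitor` are all made up for illustration. It counts consecutive over-threshold snapshots rather than averaging, which is a simpler stand-in for "consistently above".

```shell
#!/bin/sh
THRESHOLD=99   # percent, from the question
INTERVAL=30    # seconds between snapshots (assumed value)
LIMIT=6        # 6 snapshots x 30 s = roughly 3 minutes

# Succeeds when $1 > $2; awk is used because ps reports fractions.
over_threshold() {
    awk -v cpu="$1" -v max="$2" 'BEGIN { exit !(cpu + 0 > max + 0) }'
}

# next_streak CPU STREAK: print STREAK + 1 if CPU is over the threshold,
# otherwise print 0 (one snapshot below the threshold resets the count).
next_streak() {
    if over_threshold "$1" "$THRESHOLD"; then
        echo $(( $2 + 1 ))
    else
        echo 0
    fi
}

# One large loop with a sleep at the end, as described above.
monitor() {
    streak=0
    while :; do
        cpu=$(ps -e -o pcpu= -o comm= |
              awk 'index($2, "winbind") { t += $1 } END { print t + 0 }')
        streak=$(next_streak "$cpu" "$streak")
        if [ "$streak" -ge "$LIMIT" ]; then
            /etc/init.d/winbind restart
            streak=0
        fi
        sleep "$INTERVAL"
    done
}

# Start it with:  monitor
```

Keeping the counting logic in its own small function means you can try it by hand (`next_streak 99.5 2`) without waiting for a real CPU spike.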
Since you're going through all the work, you could make it more generic than just your specific process (winbind) and pass variables or have a configuration file to monitor any process you like for whatever attribute you like. Have a look at chapter 31 in Expert Shell Scripting for more.
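One way the configuration-file idea could look, purely as an illustration and not taken from the book: a file with one `process:max_cpu:restart_command` rule per line. Both the format and the function name `rules_over_limit` are invented here. The function reads a ps snapshot on stdin and prints the restart command of every rule whose total %CPU is over its limit; it only prints the commands, so you can check the output before wiring it into a loop.

```shell
#!/bin/sh
# Hypothetical config file, one rule per line:
#   # process:max_cpu:restart_command
#   winbind:99:/etc/init.d/winbind restart

# rules_over_limit CONFIG: read "pcpu comm" lines on stdin and print the
# restart command of each rule whose matching processes exceed max_cpu.
rules_over_limit() {
    snapshot=$(cat)                      # keep the snapshot for every rule
    while IFS=: read -r name max cmd; do
        case $name in ''|'#'*) continue ;; esac   # skip blanks and comments
        printf '%s\n' "$snapshot" |
            awk -v n="$name" -v max="$max" -v cmd="$cmd" '
                index($2, n) { t += $1 }
                END { if (t > max + 0) print cmd }
            '
    done < "$1"
}

# Live usage:
#   ps -e -o pcpu= -o comm= | rules_over_limit /path/to/monitor.conf
```

This only covers the lookup for a single snapshot; tracking how long each rule has been over its limit would still need per-rule state in the main loop.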