Monitoring Crontabs

Hi. So, here is a million-dollar question. As Unix and Linux admins we all use cron jobs to automate our tasks. But what if we need to monitor the cron jobs themselves? Oh boy, that can be a pain.

A few techniques:

  1. Redirect the output of the cron job to a log file.
  2. Use the MAILTO option to get an e-mail notification on success or failure of a cron job.

But in a large setup these seem to be of little value. Let's take an example setup and then decide what the best option would be:

  1. LAMP technology.
  2. 75 VPS with LAMP.
  3. All VPSs have cron entries that wget a PHP file from the local website.
  4. Almost 10 cron jobs running per VPS.
  5. The cron jobs run every minute, so each one executes 24*60 = 1440 times a day.

Neither of the above ideas is ideal in this scenario. Going to 75 individual VPSs and looking through the log files would be a pain to the core.
The MAILTO option seems promising, but getting 1440*10*75 = 1,080,000 mails a day would just be another mail bomb.

So, my question is: how can we monitor these cron jobs? Are there any tools available that check these cron jobs and send the results to a particular web page? Life would be easier if we had something like this: one page with all the information.

Secondly, can we use the MAILTO option to build something like this ourselves? For example, use MAILTO to send the mails to a specific e-mail address that belongs to a ticketing system, or to some other kind of unit that generates a report as an HTML or PHP page.

Friends, I am not sure how meaningful this will be for you guys, but it is actually a pain for me these days.

Any sort of help or comment will be a great help.

At this stage, it's still a relatively small environment so cron still sounds viable. But if you have a task running every minute, consider the implications if it starts hanging: you'd blow out cron's limit on concurrent jobs in just under two hours and all cron jobs would stop.

Might I suggest just having a while - do - sleep 60 loop instead? You can use cron to watchdog that script if required too.
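
A minimal sketch of such a loop, assuming the job is the `wget` call on the local `cron.php` page shown in the crontab later in this thread. The names `run_job`, `job_loop`, `JOB_URL` and `INTERVAL` are made up for illustration. Because runs happen one after another, a hung job delays the next run instead of piling up extra processes.

```shell
#!/bin/sh
# Illustrative sketch only: run_job, job_loop, JOB_URL and INTERVAL are
# hypothetical names, not part of any existing tool.

JOB_URL="${JOB_URL:-http://localhost/admin/cron.php}"
INTERVAL="${INTERVAL:-60}"   # seconds to sleep between runs

# Run the job once; the return value is wget's exit status.
run_job() {
    wget -q -O /dev/null "$JOB_URL"
}

# job_loop MAX_RUNS
# Runs the job, sleeps, and repeats. MAX_RUNS=0 means loop forever.
job_loop() {
    max="$1"
    count=0
    while [ "$max" -eq 0 ] || [ "$count" -lt "$max" ]; do
        # Log failures to stderr, but keep looping.
        run_job || echo "$(date '+%Y-%m-%d %H:%M:%S') job failed" >&2
        count=$((count + 1))
        sleep "$INTERVAL"
    done
}

# A real script would end with:   job_loop 0
# Possible cron watchdog entry (the script path is hypothetical; flock -n
# exits at once if the lock is already held, so only one copy runs):
#   * * * * * flock -n /tmp/job_loop.lock /usr/local/bin/job_loop.sh
```

The watchdog line relies on `flock` from util-linux; if that is not available, a PID-file check would be one alternative.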

But if you are going to want to scale this up at any point, I'd recommend moving away from cron altogether and going to something like Control-M or one of the other fine centralised automation tools, which would save you a costly migration later on.

Thanks, dude, for replying. Having a dedicated scheduler makes sense, but somehow I don't find it very helpful in my environment.
See, these are very small jobs which last, I guess, 2 - 3 mins at most; that's why they are scheduled on the localhost only. Moreover, a dedicated scheduler server that runs them from the other end and tries to archive the data from the web servers would be extra effort for the network. I have to look at it from the network side as well. Because they run on localhost, the network is not bothered by these jobs.
Secondly, I completely agree with you that if they hang we might end up in a real dead zone. But I have already tried to cure that with a script that works like a watchdog on these scripts and prevents more of them from starting than expected.

So, I am not very confident that a tool like Control-M would help in my situation.

But my real question was: is there something with a UI that we can integrate cron jobs with, so that we can keep an eye on the jobs from there?

I have heard of something called "hudson" which keeps track of concurrent jobs like cron jobs, but I have never used it. Any idea about this software, or any other software that could resolve this issue, or any out-of-the-box integration that could be helpful in this situation?

not sure if I understand the question...

we just make our scripts very good at error checking.
it sends an email if there is an error.
otherwise it just does exit 0, with no output.

as long as it works we don't care about logs and errors and such.
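
A rough sketch of that "silent on success, noisy on failure" pattern as a wrapper. Cron mails whatever a job prints to the MAILTO address, so a job that prints only when it fails produces mail only for failures. The name `quiet_run` and the script path in the comment are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only: quiet_run is a hypothetical helper.

# quiet_run CMD [ARGS...]
# Runs CMD, capturing stdout and stderr. Prints nothing on success.
# On failure, prints the exit status and the captured output, and
# returns the command's exit status.
quiet_run() {
    out=$("$@" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        printf 'FAILED (exit %s): %s\n' "$status" "$*"
        printf '%s\n' "$out"
    fi
    return "$status"
}

# If this were saved as a script ending with:   quiet_run "$@"
# a crontab entry could look like this (script path is hypothetical):
#   MAILTO=admin@example.com
#   */5 * * * * /usr/local/bin/quiet_run.sh wget -q -O /dev/null http://localhost/admin/cron.php
```

Note that this only catches what the exit status reports. For the `wget` jobs, an HTTP error response gives a non-zero exit, but a PHP page that returns 200 with an error message in the body would still count as success.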

Oops, I am sorry if I was not clear. The question is how you can monitor cron jobs when you have almost 1000 of them. What you have mentioned is absolutely right, but these are PHP scripts which are directly related to the application itself. Now there could be any number of reasons for a script to fail.

Let me give you an idea of the crontab entries:

12 14 * * * /etc/webmin/cron/tempdelete.pl
*/5 * * * * wget -q -O /dev/null http://localhost/admin/cron.php
0 2 * * * /etc/webmin/fsdump/backup.pl 47211236803641

So, here you can see an example of the crontab entries. These PHP scripts are basically pages from the web site which pull data out of and put data into the DB. They are not very long-running scripts.
So, my concern is, as I have mentioned, that there could be any number of reasons for these scripts to fail. Is there any idea or tool by which you can monitor them, to see which scripts ran today, which didn't, and for what reason?
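
One possible starting point, sketched under assumptions: wrap each cron command so that every run appends one tab-separated line (time, host, job name, exit status, duration in seconds) to a status file, which could then be collected from all VPSs and rendered on a single page. The names `record_run`, `failed_jobs` and `STATUS_LOG` are hypothetical, and `date +%s` is assumed to be available (it is on GNU and BSD systems).

```shell
#!/bin/sh
# Illustrative sketch only: record_run, failed_jobs and STATUS_LOG are
# hypothetical names.

STATUS_LOG="${STATUS_LOG:-/var/tmp/cron_status.log}"

# record_run JOB_NAME CMD [ARGS...]
# Runs CMD, discards its output, and appends one status line per run.
record_run() {
    job="$1"
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    printf '%s\t%s\t%s\t%s\t%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" "$job" \
        "$status" "$((end - start))" >> "$STATUS_LOG"
    return "$status"
}

# failed_jobs
# Prints "JOB_NAME FAILURE_COUNT" for every job that has at least one
# run with a non-zero exit status in the status file.
failed_jobs() {
    awk -F '\t' '$4 != 0 { n[$3]++ } END { for (j in n) print j, n[j] }' \
        "$STATUS_LOG" | sort
}

# Example call for the wget entry above (the job name is made up):
#   record_run admin_cron wget -q -O /dev/null http://localhost/admin/cron.php
```

A job that did not run at all leaves no line, so a report would also need to compare the newest timestamp per job against its expected schedule. The exit status alone does not give the reason for a failure; keeping the command's output instead of discarding it would be one way to capture that.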