cron jobs not running during certain window

Hi unix.com,

I'm currently experiencing a problem on one of our servers at work where no jobs scheduled through cron run between approximately 1:00 AM and somewhere between 2:20 and 2:40 AM.

Now, in the cron logs I do see that all jobs are launched, and I'm not seeing any errors anywhere, but none of the jobs do what they're supposed to do and no output is generated in the job-specific logs.

Our Linux admin set up a job which simply writes the date into a file every 5 minutes and, as expected, the cron log shows that the job runs, but there is no date in the output file for his test job between 1:00 and 2:40 AM.

1,6,11,16,21,26,31,36,41,46,51,56 * * * * date >>/tmp/data.log

from cron log:

...
Oct  5 01:01:01 host crond[27190]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:06:14 host crond[30807]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:11:02 host crond[31559]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:16:01 host crond[31956]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:21:01 host crond[32390]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:26:01 host crond[324]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:31:01 host crond[718]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:36:01 host crond[1152]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:41:13 host crond[1528]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:46:02 host crond[1902]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:51:15 host crond[2384]: (user) CMD (date >>/tmp/data.log)
Oct  5 01:56:08 host crond[2779]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:01:01 host crond[3161]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:06:03 host crond[3690]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:11:22 host crond[4222]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:16:36 host crond[4695]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:21:02 host crond[5227]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:26:01 host crond[5667]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:31:01 host crond[6069]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:36:01 host crond[6708]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:41:01 host crond[10208]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:46:01 host crond[21524]: (user) CMD (date >>/tmp/data.log)
Oct  5 02:51:01 host crond[23941]: (user) CMD (date >>/tmp/data.log)
...

from /tmp/data.log:

...
Wed Oct  5 00:46:01 EDT 2011
Wed Oct  5 00:51:01 EDT 2011
Wed Oct  5 00:56:01 EDT 2011
Wed Oct  5 01:01:01 EDT 2011
Wed Oct  5 02:40:28 EDT 2011
Wed Oct  5 02:40:40 EDT 2011
Wed Oct  5 02:40:46 EDT 2011
Wed Oct  5 02:40:48 EDT 2011
...
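
To make the gap between the two logs easier to see, one option is to count entries per hour in each file. The following is only a rough sketch: the function names are made up for illustration, and it assumes the cron log uses the syslog-style layout shown above and that the output file contains plain `date` output.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the thread).

# Count cron launches per hour from lines such as
#   "Oct  5 01:01:01 host crond[27190]: (user) CMD (date >>/tmp/data.log)"
# Field 3 is HH:MM:SS; only lines containing "CMD (" are counted.
launches_per_hour() {
    awk '/CMD \(/ { split($3, t, ":"); n[t[1]]++ }
         END { for (h in n) print h, n[h] }' "$1" | sort
}

# Count `date` output lines per hour from lines such as
#   "Wed Oct  5 00:46:01 EDT 2011"
# Field 4 is HH:MM:SS; short lines such as "..." are skipped.
outputs_per_hour() {
    awk 'NF >= 6 { split($4, t, ":"); n[t[1]]++ }
         END { for (h in n) print h, n[h] }' "$1" | sort
}

# Example usage (the cron log path is an assumption for this system):
#   launches_per_hour /var/log/cron
#   outputs_per_hour /tmp/data.log
```

An hour in which cron launched twelve jobs but the output file gained only one or two lines would match the window described above.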

This problem appears to affect all cron jobs.

The server is running RHEL release 4 Nahant Update 8

[user@host/]$ lsb_release -a
LSB Version:    :core-3.0-ia32:core-3.0-noarch:graphics-3.0-ia32:graphics-3.0-noarch
Distributor ID: RedHatEnterpriseAS
Description:    Red Hat Enterprise Linux AS release 4 (Nahant Update 8)
Release:        4
Codename:       NahantUpdate8

Possibly worth mentioning the server hasn't been rebooted in >1100 days.

[user@host/]$ uptime
 12:34:03 up 1100 days, 22:15,  6 users,  load average: 15.23, 12.93, 11.73

Apparently this has been happening for a long time (>1 year?) but was never reported.

I'll be signing in to this box around 1 AM tonight to see if anything peculiar is happening.

Hope that's enough info to get started.

Really appreciate any suggestions.

While you're logged in at that time, see if you can run the job manually too, and whether that works.

You might also want to redirect stderr (2>) to a file to see whether the job is producing any error output.
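
As a rough illustration of that suggestion, the test job could append stderr to its own file, and a small wrapper could also record the exit status. The error file name `/tmp/data.err` and the `run_logged` helper are assumptions made up for this sketch, not something from the original setup.

```shell
#!/bin/sh
# Hypothetical crontab entry with stderr captured (file name is an assumption):
#   1,6,11,16,21,26,31,36,41,46,51,56 * * * * date >>/tmp/data.log 2>>/tmp/data.err

# run_logged OUT ERR CMD...
# Run CMD, appending stdout to OUT and stderr to ERR, then record the
# exit status in ERR so that even a silent failure leaves a trace.
run_logged() {
    out=$1
    err=$2
    shift 2
    "$@" >>"$out" 2>>"$err"
    status=$?
    echo "exit status $status: $*" >>"$err"
    return $status
}

# Example usage:
#   run_logged /tmp/data.log /tmp/data.err date
```

If the error file stays empty during the window and no exit status line appears until much later, that would suggest the command is being delayed rather than failing.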

It appears that the system is nearly grinding to a halt during this window which is why the jobs don't run.

Now to find out why...
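
One way to start narrowing that down is to record what the system is doing during the window. Below is a minimal sketch, assuming `uptime`, `ps` and `vmstat` are available on the box; the `snapshot` function and the log file name are hypothetical. Since cron jobs themselves appear to stall in this window, it is probably better run in a loop from a login shell opened before 1 AM than from cron.

```shell
#!/bin/sh
# snapshot LOG [SAMPLES]
# Append a timestamped snapshot of system load to LOG.
# SAMPLES is the number of one-second vmstat samples (default 5).
snapshot() {
    log=$1
    samples=${2:-5}
    {
        echo "=== $(date) ==="
        # Load averages over the last 1, 5 and 15 minutes.
        uptime
        # Header line plus processes in uninterruptible sleep (state D),
        # which often point at stalled I/O.
        ps -eo pid,stat,pcpu,pmem,comm | awk 'NR == 1 || $2 ~ /^D/'
        # CPU, memory, swap and block I/O activity.
        vmstat 1 "$samples"
    } >>"$log" 2>&1
}

# Example usage (log file name is an assumption):
#   while :; do snapshot /tmp/snapshot.log; sleep 30; done
```

Comparing snapshots from just before 1 AM with ones taken inside the window may show whether the slowdown lines up with swapping, heavy I/O, or a particular process.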