Continuing reliability problems with SCO 6.0 on Dell PowerEdge 1800 with PERC DC4.

Hello all,

I'm experiencing weird reliability problems with an SCO OpenServer 6.0 server that runs on a Dell PowerEdge 1800 equipped with a PERC DC4 RAID controller and four 36 GB 15K rpm hot-swappable SCSI hard disks.
It runs a RAID 10 configuration.
It uses the mega HBA driver (Revision 8.03a, Release Date: 03 May 2005) for the PERC DC4 RAID controller.
The machine runs like a breeze, with very fast access to the filesystem, but the problem is that it can run without trouble for months (e.g. 3 months) and then suddenly freeze. I should add that the problem gets worse after the first freeze following a fresh installation.
I've had a script on it that every 10 seconds wrote the time plus the output of w and ps -ef to a new file on the hard disk.
The script created the last file before the system froze, but did not write any content to it.
The file from 10 seconds earlier is complete.
Based on that observation, it seems to me that at the moment the system freezes, the filesystem (RAID?) stops.
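
For reference, here is a minimal sketch of the kind of logging script described above. It is not the original script: the function names take_snapshot and run_logger, the snap-*.txt file naming, the example directory, and the sync call are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: names, paths and the sync call are assumptions,
# not taken from the original script.

# Write one snapshot (time, w, ps -ef) to a new file in directory $1.
take_snapshot() {
    snap_file="$1/snap-$(date +%Y%m%d-%H%M%S).txt"
    # Group the commands so all output lands in a single file;
    # errors (e.g. a missing command) are recorded in the file as well.
    { date; w; ps -ef; } > "$snap_file" 2>&1
    # Ask the kernel to flush buffered data to disk (assumption: the
    # original script did not necessarily do this).
    sync
}

# Take a snapshot every $1 seconds into directory $2, until killed.
run_logger() {
    interval=$1
    dir=$2
    mkdir -p "$dir" || return 1
    while :; do
        take_snapshot "$dir"
        sleep "$interval"
    done
}
```

It could be started with something like `run_logger 10 /var/tmp/freezelog &` (a hypothetical directory). Without the sync, a file can be created while its content is still sitting in the buffer cache when the machine hangs, so an empty last file does not by itself prove that the disk subsystem stopped first.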
I've been thinking of going back to a single SCSI disk in this system, or of switching to another RAID controller certified for SCO 6.0.
We've tested the RAID controller extensively in a Windows environment, and the diagnostic tools that come with the Dell system don't find any errors in this system during tests.
I think the problems are due to the SCO driver, but I don't know for sure. Any help on this one?

With kind regards,

Frederik

Hi All,

I've added the KHZ=100 parameter to /stand/boot and fsck-ed all filesystems, as suggested in this newsgroup thread:
"OpenServer 6.0.0 hangs randomly. Please help!" (comp.unix.sco.misc, Google Groups)
They describe a similar problem with an HP ProLiant G5 there (which seems to me a very rugged machine; I can't imagine that such hardware easily causes problems).
The machine is back in the rack and turned on, and I hope this has cured my problem.
Anyway, if there are more suggestions, they are welcome.

Frederik