Hi,
We have a filesystem mystery on our hands. Given:
- 2 machines, A and Aa.
- Machine Aa is the problem machine.
- Machine A is running Ubuntu, kernel 2.6.22.9 #1 SMP Wed Feb 20 08:46:16 CST 2008 x86_64 GNU/Linux. Machine Aa is running RHEL5.3, kernel 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux.
- Both are running the anticipatory i/o scheduler.
- Both are running two software applications, x and y
- x reads from the network and writes to disk.
- y reads from x's files, filters, and writes annotated data about x's output
- Both x and y perform their work in partition W, and they are the only applications that open files in that partition. I have verified this with lsof, and also by shutting down the applications and immediately unmounting the partition (successfully), so I know those are the only two things with files open there.
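For reference, the open-file check above can be reproduced like this (the mount point /W is an assumption here; substitute the real path of partition W):

```shell
# lsof treats a mount-point argument as "all open files on that filesystem",
# so this lists every process holding a file open in partition W
lsof /W

# Same idea with fuser: -m means "files on this filesystem", -v is verbose
fuser -vm /W
```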
The problem:
- On machine A, our disk utilization for the device whereupon partition W is mounted is very low. Interactive response is very good.
- On machine Aa, our disk utilization is very high. Interactive response can sometimes lag; e.g., an ls of a directory in the busy partition can take 10-15 seconds to return.
So, how can we determine what causes the slowness on machine Aa?
Discussion:
Here are some example iostat readings, using
iostat -x -d <device> 1
First, machine A (the good machine):
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c1d0 0.00 123.76 0.00 891.09 0.00 8118.81 9.11 1.64 1.84 0.06 4.95
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c1d0 0.00 108.91 0.00 1273.27 0.00 11065.35 8.69 1.81 1.42 0.07 8.91
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c1d0 0.00 3905.94 0.00 2447.52 0.00 50835.64 20.77 11.80 4.82 0.04 10.89
Now machine Aa. Notice how the w/s values are so much smaller, yet %util is 100:
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d1 0.00 687.00 0.00 568.00 0.00 6144.00 10.82 144.54 292.86 1.76 100.10
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d1 0.00 291.00 2.00 355.00 256.00 9104.00 26.22 122.17 425.53 2.80 100.10
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d1 0.00 1239.00 1.00 603.00 128.00 15128.00 25.26 95.82 145.84 1.65 99.90
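The iostat columns themselves roughly explain the discrepancy: %util is approximately IOPS × svctm / 10 (svctm is in milliseconds, so dividing by 1000 and multiplying by 100 reduces to /10). A quick sanity check of that identity against the first sample from each machine:

```shell
# Machine Aa, first sample: (r/s + w/s) = 0 + 568, svctm = 1.76 ms
awk 'BEGIN { printf "Aa: %.1f%%\n", (0 + 568) * 1.76 / 10 }'

# Machine A, first sample: w/s = 891.09, svctm = 0.06 ms
awk 'BEGIN { printf "A:  %.1f%%\n", 891.09 * 0.06 / 10 }'
```

So Aa's per-request service time (~1.8 ms) is roughly 30x machine A's (~0.06 ms, which itself suggests A's writes are being absorbed by the controller's write cache). At that service time, even a third of A's IOPS saturates the device, and the queue backs up (avgqu-sz ~95-145, await ~145-425 ms), which would explain the 10-15 second ls.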
We are at a loss. What can we look for? I have changed the i/o scheduler around (to cfq, noop, deadline); this has made no difference to the stats of Aa. There is perhaps 60 GB/day written to partition W. While an appreciable amount, machine A, the busier of the two, seems to have no problem with it.
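For the record, the scheduler changes were made per-device through sysfs. A sketch, assuming device cciss/c0d1 (on these kernels the cciss devices appear in sysfs with '!' in place of '/'):

```shell
# Show the available schedulers; the active one is shown in [brackets]
cat /sys/block/cciss!c0d1/queue/scheduler

# Switch to deadline; takes effect immediately, no remount needed
echo deadline > /sys/block/cciss!c0d1/queue/scheduler
```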
On machine Aa I have done
dd if=/dev/zero of=/file/in/partitionW bs=1024 count=20000000000
and get 189 MB/sec to partition W. Running the same test against the regular root disk, I get 213 MB/sec. So raw sequential write throughput seems to be fairly fast.
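One caveat on that measurement: a plain dd from /dev/zero mostly exercises the page cache and the controller's write cache, so it can look fast even when the underlying array is slow. Two variants that may be more representative (the output path and sizes here are illustrative, not the exact command we ran):

```shell
# Bypass the page cache entirely with O_DIRECT; bs must be block-aligned
dd if=/dev/zero of=/file/in/partitionW bs=1M count=1024 oflag=direct

# Or keep the cache but force data to disk before dd reports a rate
# (conv=fdatasync requires a reasonably recent GNU coreutils)
dd if=/dev/zero of=/file/in/partitionW bs=1M count=1024 conv=fdatasync
```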
Both machine A and machine Aa have an HP P400i RAID card. The card is set to cache writes, and the cache is split 256 MB read / 256 MB write. Both machines' disks are configured in a RAID 5 hardware array. The devices, then, use HP's cciss driver, and there is no software RAID or logical volume layer built on top of them.
Machine Aa is an HP DL360 G6; machine A is an older G5. Yet it is Aa, the newer machine, that has the problem...?
The HP Proliant utilities run on machine Aa show no issues with the drives. There are no orange warning lights on the front of the machine.
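One specific thing worth comparing on the two controllers is the cache and battery state: if the battery-backed write cache on Aa's P400i is disabled (a dead battery is a common cause), writes hit the RAID-5 array directly and per-request latency balloons, which would match the svctm difference above. A sketch using HP's hpacucli (the slot number is an assumption; list the controllers first):

```shell
# Find the controller's slot number
hpacucli ctrl all show

# Dump controller detail; look for "Cache Status", "Battery/Capacitor Status",
# and whether the write cache is enabled
hpacucli ctrl slot=0 show detail | grep -i -E 'cache|battery'
```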
Is there anything else we can look at? Thanks.
-Mike Schwager