RAID0 "stripes" the data across the three actuators you have and the stripe size (that's official RAID speak) is the minimum allocation. So if the stripe is 2k then the first 2k bytes of a file is written to the first drive, the next 2k to the second drive, and the third 2k to the third drive. It then goes back to the first drive, and so on.
So it's not difficult to see that writing lots of small files will give unpredictable results, especially if they're less than 2k each. Also, read requests can only be satisfied by reading the drive(s) where the files were written.
So your results are misleading.
If you have a desire to test this then you need to do something like this:
Create a 4GB file on (ideally) an internal drive that is not part of this RAID0 array. Kick all the users off if you can, then copy this 4GB file to the RAID filesystem and take your measurements whilst that's going on. It won't be precise but should give you a better set of figures.
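Something along these lines (a sketch; /home/test4g.bin is just an example path, and iostat comes from the sysstat package):

# Build a 4GB test file on a disk outside the array
dd if=/dev/zero of=/home/test4g.bin bs=1M count=4096
# Copy it onto the RAID filesystem while watching per-disk throughput
cp /home/test4g.bin /galaxy/ &
iostat -x sdb sdc sdd 2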
True, this will be off by the overhead of /dev/zero, but wouldn't that be negligible given the bandwidth of the disks compared to that of the memory interface (which are some orders of magnitude apart)?
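For what it's worth, the /dev/zero side can be measured in isolation, with no disk involved:

dd if=/dev/zero of=/dev/null bs=1M count=4096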
First of all, my problem no longer occurs. I created the RAID with sdb, sdc and sdd on April 11 at 09:35.
Until 11:32, sdd was very busy, then until 14:51, sdc was very busy.
Since then (3 days), the 3 disks have all been under the same moderate load (0-20%). The server is used by 5 graphic designers manipulating quite large files (100M-2G).
I ran some tests and the results leave me quite puzzled. I created 10 files simultaneously, 1GB each, but all the load went to sda, leaving sdb, sdc and sdd with a moderate 20% load.
The command:
for i in {1..10}; do
    # Create a uniquely named 1GB file on the array; record its name and the dd PID
    file=$(mktemp /galaxy/XXXXXXX)
    echo "$file" >> /galaxy/dd.files
    dd if=/dev/zero of="$file" bs=1G count=1 &
    echo $! >> /galaxy/dd.pids
done
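The names recorded in dd.files also make cleanup easy afterwards:

# Remove the test files created above
xargs rm -f < /galaxy/dd.files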
You are producing stats for the disks, but the stripe is made of partitions on those disks. Are there other partitions on those disks? Also, what filesystem was created on the striped md device, and where is it mounted?
/dev/zero is a character (c) device on the root filesystem in the directory /dev
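You can see this from the leading "c" in the ls -l output (sample output; owner and timestamp will vary):

ls -l /dev/zero
crw-rw-rw- 1 root root 1, 5 Apr 14 10:00 /dev/zero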
sda contains /, /tmp, /usr, /var and swap. sdb, sdc and sdd contain only one partition each: sdb1, sdc1 and sdd1. md0 is made of sdb1, sdc1 and sdd1:
Disk /dev/sda: 2000GB
Number Start End Size File system Name Flags
1 17.4kB 2000MB 2000MB ext4 boot
2 2000MB 4000MB 2000MB ext4
3 4000MB 9000MB 5000MB ext4
4 9000MB 14.0GB 5000MB ext4
5 14.0GB 22.0GB 8000MB linux-swap(v1)
6 22.0GB 32.0GB 10.0GB ext4
7 32.0GB 2000GB 1968GB ext4 lvm
Disk /dev/sdb: 2000GB
Number Start End Size Type File system Flags
1 512B 2000GB 2000GB primary raid
Disk /dev/sdc: 2000GB
Number Start End Size Type File system Flags
1 512B 2000GB 2000GB primary raid
Disk /dev/sdd: 2000GB
Number Start End Size Type File system Flags
1 512B 2000GB 2000GB primary raid
Disk /dev/md0: 6001GB
Number Start End Size File system Flags
1 0.00B 6001GB 6001GB ext4
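The same layout can be cross-checked with lsblk and the md status file (example commands; output omitted):

lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
cat /proc/mdstat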
I found the problem.
The reason why sda is under heavy stress when using dd on md0 (sdb, sdc, sdd) is the swap.
for i in {1..10}; do
    # Each dd allocates a full 1GB buffer in memory before writing
    dd if=/dev/zero of="$(mktemp /galaxy/XXXXXXX)" bs=1G count=1 &
done
Running dd 10 times with bs=1G requires 10GB of memory, which I don't have. So the system uses the swap on sda, and md0 is quietly waiting, doing nothing.
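You can watch the swapping happen while the jobs run; in vmstat the si/so columns show pages swapped in/out per second:

vmstat 2
free -m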
for i in {1..10}; do
    # 256MB blocks: roughly 2.5GB buffered across all 10 dd processes
    dd if=/dev/zero of="$(mktemp /galaxy/XXXXXXX)" bs=256M count=4 &
done
Running dd 10 times with bs=256M requires 2.5GB of memory, which I have. So all the stress is on md0.
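An alternative that avoids large in-memory buffers entirely (a sketch, assuming GNU dd) is a small block size, optionally with oflag=direct to bypass the page cache:

for i in {1..10}; do
    # Still 1GB per file, but only 1MB buffered at a time per process
    dd if=/dev/zero of="$(mktemp /galaxy/XXXXXXX)" bs=1M count=1024 oflag=direct &
done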