Good morning,
for the impatient: I have a new backup server and need to find out what the machine is capable of. What is the best way to do that?
I will tell the story from the beginning, so you have an idea of what is going on:
I have a setup of three machines:
A new backup server running Solaris 10 on Intel, with two bonded 1 Gbit connections to a Netgear switch and fourteen 1.5 TB hard drives forming a raidz pool of two groups of seven drives each.
Then I started rsync/scp transfers from three different machines, one after the other; they are now running simultaneously:
The first is a freshly installed FreeBSD 7.2 in a prebuilt NAS case, with two bonded 1 Gbit connections to the very same switch and an internal 8-port RAID. The controller splits the 3.5 TB into two chunks, which I joined via ccd.
On the backup server I started an rsync -varu to it in a screen session; according to du -sh, it has transferred 330 GB of data in 17 hours so far.
The second is our primary fileserver: Debian Linux, a 3ware RAID controller with 16 disks of 500 GB each in RAID 5, and six 1 Gbit connections to the same switch as the backup server. It sits idle during the night and has transferred approx. in 17 hours.
The third is a really old fileserver with Debian Linux, a 4 TB RAID 5 and a single 1 Gbit connection. It has transferred 40 GB in 100 minutes!
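For perspective, here is a rough back-of-the-envelope calculation of those rates against what a single gigabit link can carry in theory (treating the 330G and 40G figures as GiB):

```shell
# Rough throughput of the transfers above vs. one gigabit link
# (numbers from the post: 330 GB in 17 h, 40 GB in 100 min).
awk 'BEGIN {
  printf "FreeBSD NAS : %5.1f MiB/s\n", 330*1024/(17*3600)
  printf "old server  : %5.1f MiB/s\n", 40*1024/(100*60)
  printf "1 Gbit wire : %5.1f MiB/s (theoretical, before overhead)\n", 1e9/8/1048576
}'
```

So even the old lemon is outrunning the new NAS, and both transfers use well under a tenth of a single link's capacity, never mind the bonded pair.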
So, how do I monitor those machines? Most important is the Solaris server: how fast can it actually write and read data? I would expect the filesystem to outrun the network connection by light-years, true? And how can I monitor the network interfaces and see how much spare bandwidth they still have?
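One crude way to get a local baseline for the pool, with the network taken out of the picture entirely, is a plain dd run. This is only a sketch: the test-file path is a placeholder that should point at the raidz pool's mountpoint (not /tmp, which is RAM-backed tmpfs on Solaris), and since compression is on, /dev/zero data compresses to almost nothing, so this shows a best case rather than real payload throughput.

```shell
# Crude local write/read baseline, bypassing the network entirely.
# TESTFILE is a placeholder -- point it somewhere on the raidz pool,
# and raise count well past RAM size to defeat the ZFS cache.
# With compression on, /dev/zero compresses to almost nothing, so
# this is a best case; a file of real data gives honest numbers.
TESTFILE=${TESTFILE:-/tmp/ddtest.tmp}
dd if=/dev/zero of="$TESTFILE" bs=1024k count=128   # write test
dd if="$TESTFILE" of=/dev/null bs=1024k             # read test (may hit the cache)
rm -f "$TESTFILE"
```

GNU dd prints a rate itself; the stock Solaris dd only prints record counts, so wrap the runs in time(1) there. For watching the machine while the transfers run, the standard Solaris tools should cover it: zpool iostat 5 and iostat -xn 5 for disk activity, netstat -i for interface counters.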
The third one is, well, a lemon, and it only runs along "for fun". But the primary fileserver is due to be replaced by a new Solaris machine, and 330 GB in 17 hours is crap on an idle network between two idle machines.
I should add, of course, that the transmitted files range from rather big 4 GB chunks down to tiny 50 kB files. Nevertheless, shouldn't the machines manage much more in that much time? I need to find the bottleneck; is there any approach other than trying to flood the machine from twenty others?
PS: Is it normal for ZFS to cache data before writing it to the disks (compression is on)? I noticed when I started the second scp that the fileserver's disk LEDs were flashing like crazy, while the backup server's stayed dark for about 20 seconds, then gave some three seconds of disk-activity fireworks, went dark for 20 seconds again, and so on...