GFS2 vs XFS vs ext4

Looking for suggestions as to which filesystem to go with. I currently use GFS2 on hosts with 3.4 TB usable. I understand GFS2 is being left behind, but XFS and ext4 are not yet fully certified on CentOS 5.2. I have email storage hosts with a decent I/O requirement and 12 TB usable after RAID 1+0. We tried ext3, but it was just too slow; I'm not sure whether ext4 is any faster. I would also like to hear people's experiences with recovery on these filesystems: how long does fsck take, how successful is it, etc.?

Any feedback is appreciated.

FWIW, we use ext3 and it works fine.

Large ext3 partitions can be slow to fsck, but aren't that bad in operation. It's also important to note that the default ext3 mount options are brain-dead for large, heavily-cached systems; for instance, a commit interval of 5 seconds is rather small, and the default 'ordered' writing mode is extremely safe but sometimes a bottleneck.
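As a sketch of the sort of tuning meant here (the device, mount point, and values below are illustrative placeholders, and data=writeback trades some crash-consistency guarantees for speed):

```
# /etc/fstab — hypothetical tuned ext3 entry
#   noatime         skip atime updates on every read
#   data=writeback  journal metadata only; faster than the default 'ordered'
#   commit=30       flush the journal every 30s instead of the default 5s
/dev/sdb1  /srv/mail  ext3  noatime,data=writeback,commit=30  0  2
```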

On the other hand, ext3 is excellent at safety. I've seen ext3 recover from horrible abuse.

ext4 is somewhat faster but the difference is not gigantic. Its fsck is much faster than ext3's for partitions larger than hundreds of gigs. I don't feel it's quite mature, though. Only time will tell if it's as reliable as ext3.

Another filesystem you might consider is xfs. It's fairly mature, and designed for huge, fast transfers...

We tested ext3 before; it couldn't handle our load.

With what mount options? As corona688 said, there are *VAST* differences between the default options and a workload-tuned setup.

Secondly, it is not at all clear from what you have said whether it is the file system or your storage configuration that is the problem; nor is it possible for anyone to give you a good recommendation based only on the size of the storage.

  1. What type of storage are you using?
  2. What type of disks?
  3. If using a controller-based array, is it optimized for sequential or random operation?
  4. Will you mirror on-host or use hardware RAID?
  5. What is your storage block size?
  6. What is the average size of your writes?
  7. What is the breakdown of reads/writes?
  8. Is it really optimal to create a 12 TB LUN and put all your eggs in one basket, or could you achieve better results with more, smaller LUNs?
  9. Do you have file hotspots (certain files heavily accessed)?
  10. How many files do you have in the largest directory (counting sub-directories)?

For a quick overview of options, search for a presentation called "Choosing and Tuning Linux File Systems" by Val Henson of the Intel Linux group.

  1. What type of storage are you using?
    Penguin x8dtn, 24 × 1 TB disks.
  2. What type of disks?
    SATA.
  3. If using a controller-based array, is it optimized for sequential or random operation?
    Don't know.
  4. Will you mirror on-host or use hardware RAID?
    Controller-based RAID 10, six disks per RAID set.
  5. What is your storage block size?
    64k.
  6. What is the average size of your writes?
    Don't know; my guess is a few megabytes, no more than 16 MB. These are email attachments.
  7. What is the breakdown of reads/writes?
    Not sure.
  8. Is it really optimal to create a 12 TB LUN and put all your eggs in one basket, or could you achieve better results with more, smaller LUNs?
    Yes; we are using 3 TB LUNs.
  9. Do you have file hotspots (certain files heavily accessed)?
    Just the newer ones, a bit.
  10. How many files do you have in the largest directory (counting sub-directories)?
    Millions.

Millions of files in one giant directory isn't optimal in any filesystem I've ever heard of. I bet your giant untuned ext3 filesystem didn't have directory indexes enabled, which are highly recommended for directories larger than a few thousand entries.
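If that ext3 volume is ever revisited, hashed directory indexes can be enabled after the fact. A sketch assuming standard e2fsprogs (/dev/sdX is a placeholder, and the filesystem should be unmounted before the fsck):

```shell
# Turn on hashed b-tree directory indexing (dir_index) on an existing ext3 fs
tune2fs -O dir_index /dev/sdX
# -f: force a full check; -D: optimize (re-index) existing directories
e2fsck -fD /dev/sdX
```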

Can't find anything about this setup. Do you have specs or a link?

Definitely not optimal for the amount of files or the access patterns you describe.

Sounds like it should be random-optimized, but it depends on the array or RAID controller and its capabilities whether or not this is selectable.

Seems like a reasonable choice.

That is the size of the objects, not the size of the writes. iostat data for the file systems would be good here.

This might be essential information; depending on the pattern, caching correctly might improve performance drastically.

This is in all probability your main bottleneck; more file systems, or directories with fewer files, could improve this.
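To get real numbers for items 6 and 7 (write sizes and the read/write mix), sysstat's iostat is the usual starting point. A sketch (device names and the exact column set vary by sysstat version):

```shell
# Extended per-device statistics every 5 seconds, 12 samples
#   r/s, w/s  -> read/write operation breakdown
#   avgrq-sz  -> average request size in 512-byte sectors (older sysstat)
iostat -x 5 12
```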

I was told they did have it turned on. Basically, at the time we started this layout we were the largest email system in the world (still are, I think). The only way we had to handle the I/O load we were getting was small files and basically deep, random directories. ReiserFS was the only file system that could handle it. Red Hat assigned about five engineers to work with us and couldn't get ext close. They built us a custom GFS2 after numerous attempts. It is now in the public release; I forget which version.

We've been screaming at you to tune it for days... If it had been tuned in the first place, you might have said so! We're only trying to help you here.

If ext3 isn't fast enough for you, I don't think ext4 is going to be either. But performance in whatever filesystem you choose could be improved a lot by not having millions of files in one directory.
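One common mitigation, independent of filesystem choice, is to shard files into nested subdirectories keyed by a hash prefix of the name, so no single directory accumulates millions of entries. A minimal sketch in Python (the root path, bucket depth, and function name are illustrative assumptions, not the OP's actual layout):

```python
import hashlib
import os

def shard_path(root: str, name: str, levels: int = 2) -> str:
    """Return a nested path for `name` under `root`, using the first
    `levels` hex digits of an MD5 hash as subdirectory names."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    buckets = list(digest[:levels])      # e.g. ['a', '7'] -> root/a/7/name
    return os.path.join(root, *buckets, name)

# With levels=2 there are 16 * 16 = 256 buckets, so 10 million files
# average roughly 39,000 entries per directory instead of 10 million in one.
```

Two hash levels keep each directory to tens of thousands of entries at this scale; a deeper tree trades extra lookup steps for even smaller directories.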

You may want to think about some sort of higher end LVM/filesystem configuration.

Simply changing filesystems without putting extra intelligence and disk I/O bandwidth between the OS and directory access is not going to help much.

I don't think the OP has posted any hard numbers on the I/O performance he is getting, nor the exact specifics of his RAID setup (connection type, RAID block size, etc.). Without all that, it's hard to solve his problem.

Can you point me to a resource where I can read more about higher-end LVM/filesystem configuration, or about putting extra intelligence and disk I/O bandwidth between the OS and directory access?