Copying Thousands of Tiny or Empty Files?

There is a procedure I do here at work where I have to synchronize file systems. The source file system always has three or four directories holding hundreds of thousands of tiny (1 KB or smaller) or empty files. Whenever my rsync command reaches these directories, I wait for hours for those files to finish copying. Is there any way to decrease the time it takes to copy them?

The files are generated by an application that definitely needs them, and I'm in no position to dispense with them. I wondered about trying to 'tar' the directories first, but I suspect that if I do, I'll merely be moving the time spent copying them during rsync to the time spent to create the archive in the first place.

My rsync command is pretty basic:

rsync -auvlxHS /source_dir/ /dest_dir/

Usually /dest_dir/ is a new, empty file system so it really is a full copy, but sometimes there are actual synchronizations done. However, if there's a better approach than my rsync, I'd like to know.
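On the tar idea: the archive never has to be written to disk, because tar can stream straight from the source tree into the destination tree through a pipe. Below is a minimal sketch, assuming both file systems are mounted on the same host. The function name tar_copy is made up for illustration, and whether this actually beats rsync on your data would have to be measured.

```shell
#!/bin/sh
# tar_copy is a hypothetical helper name, not a standard command.
# Streams the tree under $1 into $2 without an intermediate archive file.
tar_copy() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Left side writes the archive to stdout; right side unpacks it from
    # stdin, with -p asking tar to preserve file permissions.
    (cd "$src" && tar cf - .) | (cd "$dest" && tar xpf -)
}
```

Something like "tar_copy /source_dir /dest_dir" would cover the full-copy case into an empty file system. It does not compare against what is already on the destination, so the occasional true synchronization would still need rsync.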

Can you run multiple rsync processes in parallel, dividing up the source and destination trees among them?

rsync -auvlxHS /source_dir/dir1/ /dest_dir/dir1/
rsync -auvlxHS /source_dir/dir2/ /dest_dir/dir2/
rsync -auvlxHS /source_dir/dir3/ /dest_dir/dir3/
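To actually run these side by side from one script, each rsync can be put in the background and waited on. This is only a sketch: parallel_sync is a made-up helper name, it assumes one process per top-level subdirectory is a sensible split, and it drops -v and -l from the original flags (-a already implies -l) to cut terminal output.

```shell
#!/bin/sh
# parallel_sync is a hypothetical helper name, not part of rsync.
# Starts one rsync per top-level subdirectory of $1, copying into $2.
parallel_sync() {
    src=$1
    dest=$2
    pids=""
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue      # glob matched nothing
        name=$(basename "$dir")
        mkdir -p "$dest/$name"
        # Trailing slashes: copy the contents of dirN into dest/dirN,
        # rather than creating dest/dirN/dirN.
        rsync -auxHS "$dir" "$dest/$name/" &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1        # remember any failed rsync
    done
    return $status
}
```

Called as "parallel_sync /source_dir /dest_dir", this skips files sitting directly in /source_dir and any hidden top-level directories, so a final single rsync pass over the whole tree is still worth running. Hard links that span two subdirectories will not be preserved by -H, since each process only sees its own subtree.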

When you create lots of files and directories there is substantially more filesystem overhead than just writing to an existing file. You may want to do some serious filesystem tuning on the destination box, particularly the /dest_dir filesystem.

Also, having huge numbers of files in a single directory really bogs things down. A full scan of such a directory with readdir() takes a lot longer, for example.
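If it isn't obvious which directories are the oversized ones, a quick count of entries per directory will show them. A rough sketch, where entries_per_dir is a made-up helper name and file names are assumed not to contain newlines:

```shell
#!/bin/sh
# entries_per_dir is a hypothetical helper name.
# Prints "<count> <directory>" for every directory under $1, largest first.
entries_per_dir() {
    find "$1" -xdev -type d | while IFS= read -r d; do
        # ls -a also lists "." and "..", so subtract 2 from the line count.
        n=$(ls -a "$d" | wc -l)
        echo "$((n - 2)) $d"
    done | sort -rn
}
```

For example, "entries_per_dir /source_dir | head" lists the ten fullest directories.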

What OS?

I could try running multiple instances; that's a good idea, at least to test whether it's any faster than the single rsync process. The OS is HP-UX 11.11, but we expect to be moving to 11.30 soon-ish. The filesystem is vxfs, and it was created with the 'largefiles' option because we also have files that are 8 to 12 gigs in size.

The application uses the small/empty files as some kind of "label" for information in a database that needs to be changed in an indexing process. I'm not clear on it as that portion isn't my responsibility. I've been told that they're necessary. As such, I'm hoping to increase the speed of transfer. However, tuning the FS might not be workable since I need both large files and these small/empty ones.

To add to that, when it is a true sync instead of a full copy, these empty files are always different, so basically it winds up being a full copy anyway. The files are deleted and new ones created on a daily basis during the week.

Try 'man vxtunefs'. It covers the tunable parameters of a VxFS file system.