The fastest way to copy huge data

Dear Experts,

I would like to know what's the best method for copying around 3 mio files (spread across a hundred folders, each file around 1 KB in size) between 2 servers?

I already tried using the rsync and tar commands, but they take too long.

Please advise.

Thanks
Edy

You could copy the filesystem blocks and turn that into a filesystem on the other server, or use techniques of the underlying storage hardware (if it is on a SAN), or use a previous backup to restore on the other server and use rsync for the last bit.
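As a rough illustration of the block-copy idea (purely a sketch: /dev/sdX and /dev/sdY are placeholder device names, and the source filesystem would need to be unmounted or frozen first):

dd if=/dev/sdX bs=1M | ssh root@remoteserver 'dd of=/dev/sdY bs=1M'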

If it is on ZFS then you could use that to take a snapshot and send it to the other side..
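Something along these lines, assuming, just for illustration, a dataset tank/data on the source and a pool named tank on the destination:

zfs snapshot tank/data@migrate
zfs send tank/data@migrate | ssh special_user@remoteserver zfs receive tank/data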

How much data? What is your network bandwidth? What's "too long"? How fast do you need the data to be copied?

What kind of network and/or storage hardware do you have?

If you're bandwidth limited, the protocol doesn't matter much unless it's really inefficient.

If it's a LOT of data - hundreds of gigabytes or maybe even more - the fastest way is probably to put a new hard drive into the source server, copy the data to the new hard drive and then physically move the hard drive to the target server.

What is a "mio"?

If your connection isn't saturated, then the problem is that you have folders containing hundreds of thousands of tiny files. This is never fast.

The speed limit on creating/deleting files in crowded directories is an operating system limitation. The problem is that when a file entry in a folder is created or deleted, other operations have to wait for it. And the larger a folder is, the more time it takes to add a file, since its name has to be checked against all the others for duplicates and consistency.

You cannot get around this limit using C, Perl, assembly language, or any other "creative" solution. It takes faster disks and CPUs to speed up a file tree that is laid out this inefficiently.

Have you tried using tar to create a tarball, then copying the tarball to the other server and untarring it? Newer OSes ship a tar with a --compress option, which might help.
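For example, with GNU tar the -z (gzip) flag is the commonly used form of that idea (paths are illustrative):

tar czf tarball.tar.gz ./path/to/files
scp tarball.tar.gz remoteserver:
ssh remoteserver 'tar xzf tarball.tar.gz'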

However you copy it, you may want to copy the directories over in different threads, so you get some degree of parallelism. You can do a find to get the folder names, then split the list up into separate files and have a different process migrate each folder.
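A rough sketch of that approach, using a tar pipe over ssh per top-level folder and assuming the folders live under /parent/path/to/files (in practice you would cap how many run at once):

cd /parent/path/to/files
for d in */ ; do
    tar cf - "$d" | ssh special_user@remoteserver 'cd /parent/path/to/files && tar xf -' &
done
wait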

Read my above post to understand why this is unlikely to help.

Read my above post to understand why this is unlikely to help. Disks, and especially folders, are not parallel.

If you have an ssh connection and have set up ssh keys for an account that can write to /, where /parent is the first component of the path /parent/path/to/files/:

tar cf - ./path/to/files | ssh special_user@remoteserver ' cd /parent && tar xBf - '

This runs in about half the time of:

tar cf tarfile.tar ./path/to/files
scp tarfile.tar remoteserver:
ssh remoteserver ' tar xf tarfile.tar'

When copying data via ssh pipe, always add "-e none" to the command in case any characters in the stream match the ssh escape characters:

tar cf - ./path/to/files | ssh -e none special_user@remoteserver ' cd /parent && tar xBf - '

This is really moot, though, until we get more details from the original poster.

@achenle: the OP specified 3 mio (3 million) files that are on average 1 KiB in size, so that should be on the order of 3 GiB. With a SATA disk doing 120 (sequential, but small) IOPS at a 1 KiB IO size, that should theoretically take 3,000,000 IOs / 120 IOPS = 25,000 seconds, i.e. around 7 hours for the data alone, limited by either the reading or the writing system (probably the writing side is faster, since its IOs will be more sequential in nature). This excludes the IOPS required for the metadata. If the filesystem can do write combining / prefetching then perhaps it may be a bit more efficient. If the filesystem has a larger minimum block size, that would not matter much for speed, since the block size would still be smallish.

If we take the disk out and put it in the other server, we need another stream and the same amount of time again to copy it onto the other server's disk, plus sneakernet time.

If we used the network instead, we would probably not need much more time per stream, and we could do it with a single stream, reading from one computer and writing onto the other (the network would not be a bottleneck here). So that should take on the order of half the time.

If we use any of the block copy methods in my post, there is no need to copy the files individually, nor to do all that metadata manipulation; we can read large chunks of data with big IOs (for example 1 MiB per IO), which will be significantly faster, probably on the order of 100 MB/s, so it should theoretically take on the order of 30-60 seconds for the data alone, if the network is not a bottleneck...

Of course if the data is on a large filesystem then that whole filesystem would need to be copied unless the method is smart like filesystem dumping methods or ZFS send / receive, which only copy the parts that are in use..

So that's what "mio" means...

120 IO operations per second from a SATA drive is quite optimistic. A single 7200 rpm SATA disk is realistically more likely to get about 60-70 IO operations per second, because the small reads in this case are not likely to be sequential - they'll effectively be random IO operations. If it's a 5400 rpm disk, the number would be even less.

And if atime modification isn't turned off, every read operation that reads a file will generate a write operation to update the inode data for that file.
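If that becomes a concern during the copy, on Linux the source filesystem can usually be remounted with atime updates disabled for the duration (the mount point here is illustrative):

mount -o remount,noatime /data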

So that's probably somewhere between 6 and 9 million IO operations, because metadata has to be read to even find each file. Call it 6 million IO operations, and assume the disk can do 60 IO operations per second. That's 100,000 seconds, or more like 28 hours. And that assumes the disk isn't servicing other IO operations.

Why not just share the file system via NFS and let other systems access the files that way?
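A minimal sketch of that on a Linux NFS server, with illustrative paths and addresses:

# on the server holding the data: add an export, then refresh the export table
echo '/parent/path/to/files 192.168.1.0/24(ro)' >> /etc/exports
exportfs -ra

# on the other server
mount -t nfs sourceserver:/parent/path/to/files /mnt/files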

Well, given our lack of information, there is no real answer.

IOPS are not knowable - our SAN does 12000 iops continuously if required. The sata disk on my desktop does maybe 70. And if the file systems were zfs and were on a SAN, then the "copy" time is the time it takes to type four or five zfs commands.

So maybe we are comparing apples to elephants. We do not know.

In any event, when an app (or a user) is allowed to clutter a filesystem as described, there is not a lot of hope for it. A simple find or ls command can take hours to complete on some systems. Copying it as-is does not seem like a best-practices idea to me.

cpio in pass through mode is generally regarded as being much faster than tar.
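For reference, pass-through mode copies a tree from one place to another on the same host, so for a server-to-server move the target would typically be something like an NFS mount (paths are illustrative):

cd /parent/path/to/files
find . -depth -print | cpio -pdm /mnt/files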