Start copying large file while it's still being restored from tape

Hello,

I need to copy a 700GB tape-image file over a network. I want to start the copy process before the tape-image has finished being restored from the tape. The tape restore speed is about 78 Mbps and the file transfer speed over the network is about 45 Mbps. I don't want to use a pipe, since that will limit the tape drive speed to 45 Mbps (and I want it to run at full speed). Ideally I would like to have some kind of temporary file used as a buffer that I could write to and read from simultaneously.

Restoring the image from tape takes about 2hrs 40min, and copying the file over the network about 4hrs. So running both operations one after the other takes 6hrs 40min. I'm looking at ways of reducing the total time by starting the copy process while the tape is still restoring the file.

Note: I don't want to restore the tape image directly over the network for various reasons, including permissions issues etc.

I have seen the mbuffer command, which seems to offer what I need on a Linux platform. However, I'm using the shell on Mac OS, which doesn't provide that. Any suggestions on how I could accomplish this would be much appreciated.

Kind Regards
Swami

Assuming the file is text, try starting up tail -f as soon as the file starts to appear on the disk, i.e., almost right after starting the restore operation. Set up ssh keys on the remote node first.

# -c +1 makes tail start at the first byte rather than at the last 10 lines
tail -c +1 -f /path/to/file_being_restored | ssh me@remote "cat > /path/to/new/file/newcopy_of_file"

Thanks Jim. Actually, it is a binary file. I don't need ssh since it's just transferring the data to a shared directory, which I can access just like a normal directory. Here is my existing command:

taperead -b 1048576 > tape.img

Should I do something like this:

taperead -b 1048576 > tape.img & tail -f tape.img > /volumes/shared/tape.img

What if the tape.img file doesn't yet exist when the tail command is executed? The tape drive takes a little time to configure itself before it starts restoring the data.

I'd put tail in the background instead of your tape restore. You can just create an empty file to make sure tail doesn't throw an error.

: > localfile # Truncate or create zero byte file
tail -f localfile > /path/to/nfsfile &
restore_from_tape > localfile

The trouble comes from how to tell tail when it's finished. It's binary-safe, I think, but only writes entire lines, which is inconvenient when your file may not actually end in a newline. You may have to append a newline onto your local file to kick the last 'line' out of it: echo >> localfile. Then wait for the file sizes to be equal, kill tail, and truncate both files one byte shorter.

Thanks for the reply. It sounds like quite a bit of a hack to get it to work, though. Does anyone know if it's possible to compile mbuffer for Mac OS X? That would be a much nicer solution.

I don't have a fully modern Mac to check that on right now, but I suspect not, since it uses clock_gettime, a POSIX feature which the older version of OS X I have available definitely doesn't have. It was developed on Linux and Solaris, so it may need GNU features too.

I also checked the Fink repository and don't see it there.

I suppose if the network is guaranteed to be slower than the tape, you could just cat it; it will never catch up until the tape's done. cat shouldn't care about the file size changing, it goes until EOF, whatever that may be. Give the tape a head start to build up some steam.

restore-process > localfile &
sleep 10
cat < localfile > /path/to/remotefile

Thanks - yes I was also thinking of this. Much simpler!

I'd still recommend some way to verify the entire file made it across, like an md5 sum on both sides.
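A minimal sketch of that check, in case it helps. The function names file_md5 and same_md5 are made up for this example. It assumes either md5sum (usual on Linux) or md5 with the -q option (usual on Mac OS X) is installed, and it has to read both files in full, so the remote side will take a while on a 700GB image.

```shell
# Print the MD5 digest of a file using whichever tool is installed:
# md5sum (GNU/Linux) or md5 -q (Mac OS X / BSD).
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  -" when reading stdin; keep the digest only
        md5sum < "$1" | awk '{print $1}'
    else
        md5 -q "$1"
    fi
}

# Succeed (exit status 0) only if both files have the same digest.
same_md5() {
    [ "$(file_md5 "$1")" = "$(file_md5 "$2")" ]
}
```

Once the restore and the copy have both finished, something like this would report the result:

same_md5 tape.img /volumes/shared/tape.img && echo "copy verified" || echo "copy differs"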

Read-behind-write is fraught with lots of chances to not get the entire file. You pretty much have to know that your entire software stack is set up properly. Even in the sleep-then-cat example, and even if the network is supposedly guaranteed to be slower than the tape, any hiccup reading the tape - or in the data path from tape to disk - could allow the cat to catch up with the file end.

Never mind what happens if you run into a restore process that sets the file length prior to actually writing out the data....
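One way to guard against the reader catching up with the writer is to have the copy poll the file and stop only after the restore has signalled that it is done. The following is a rough sketch under several assumptions, not a tested tool: follow_copy and the marker file are names invented here, head -c is assumed to be available (it is on Mac OS X and Linux, but it is not required by POSIX), and tail -c +N is assumed to seek rather than read through the whole file. It does nothing for a restore program that sets the final file length before writing the data.

```shell
# follow_copy SRC DST MARKER
# Copy SRC to DST while SRC is still growing. Return only after MARKER
# exists and every byte of SRC has been copied.
follow_copy() {
    src=$1 dst=$2 marker=$3
    offset=0
    : > "$dst"                      # create or truncate the destination
    while :; do
        # Look for the marker BEFORE measuring the size: if the writer had
        # already finished, the size measured below is the final size.
        finished=no
        if [ -e "$marker" ]; then finished=yes; fi

        if [ -f "$src" ]; then
            size=$(wc -c < "$src" | tr -d ' ')   # BSD wc pads with spaces
        else
            size=0                                # writer has not created it yet
        fi

        if [ "$size" -gt "$offset" ]; then
            # Append exactly the bytes between offset and size.
            tail -c "+$((offset + 1))" "$src" | head -c "$((size - offset))" >> "$dst"
            offset=$size
        elif [ "$finished" = yes ]; then
            break                   # nothing new, and the writer is done
        else
            sleep 1                 # caught up with the writer; wait for more data
        fi
    done
}
```

With the taperead command from earlier in the thread, usage might look like this, where the subshell creates the marker only after the restore exits:

rm -f tape.img.done
( taperead -b 1048576 > tape.img; : > tape.img.done ) &
follow_copy tape.img /volumes/shared/tape.img tape.img.done

A checksum comparison afterwards is still worth doing.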