Moving an extremely large number of files to a destination

Hello Techies,

I'm here with a small issue. I have read all the relevant threads on unix.com, but none of them gave me a good solution.

It's a simple and well-known issue. I want to move lots of files to a destination folder. Actually I want to pick up files older than 1 year, but even that is taking lots of time. Later on I want to archive them all.

This command is taking forever - it moved lots of files, but I got tired of waiting for it to complete the transfer (gave it 8 hrs... still in progress!)

find ./ -type f -mtime +79 -print | xargs -n1 -i mv {} /dest/

The command below returns an error:

find ./ * -print | xargs -n1 -i mv {} /dest

error: Arg list too long

The file names have a fixed pattern, based on date.
Average file size: 3090.
Total size of files: 5247124 KB.

I need to transfer them. I don't mind the time taken, but it should give me a solution.

Any help please?

Go back to using your original find but with the right mtime setting. If you have lots of files, it is going to take lots of time!
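For files untouched for about a year that would be something along these lines (365 days is only a guess at your cut-off, so adjust it to taste):

find ./ -type f -mtime +365 -print | xargs -i mv {} /dest/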

Hello there,

I didn't exactly understand your problem; you said

and then you say

So, in the end, is the time important or not?

If all of those files follow the same pattern, can you give some examples of the file names? I think I have a script that can do the job (though it's not necessarily very well optimized).
:)

When you use an asterisk, the shell must substitute all of the filenames prior to invoking "find". When you use -n1 on xargs, you are getting a separate mv process for each file. That will take several extra hours.

You give no clues as to what system and what shell you are using, so I assume you are using Solaris 8 with ksh. The find command is outputting one file per line. Let's change that to 200 files per line. I am guessing that 200 will work; you need to experiment with your filenames to find how large you can go.

find ./ -type f -mtime +79 -print | xargs  -n200 echo | while IFS="" read NAMES ; do eval mv $NAMES /dest ; done
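If you would rather avoid the eval, a variation that hands each batch to a small shell should also do it (only a sketch; "mvbatch" is just a throwaway name for $0, and 200 is still a guess):

find ./ -type f -mtime +79 -print | xargs -n200 sh -c 'mv "$@" /dest' mvbatch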

And if your mv happens to support the -t (--target-directory) option
you can use something like this (assuming your filenames do not contain pathological characters):

find . -type f -mtime +79 | xargs mv -t /dest
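A quick way to tell whether the mv on your box is the GNU one (the stock Solaris /usr/bin/mv most likely does not know -t) is something like:

mv --version 2>/dev/null | head -1

If that prints nothing, -t probably isn't available and one of the batching approaches above is the safer bet.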

First of all, thanks to everyone for providing such valuable input and sharing the knowledge...
Well, what I did was just wait overnight... The confusion here was: I didn't mind waiting 8 hours, but at the end it shouldn't throw me some stupid error like "too many arguments" and waste my 8 hours of processing.

It would have been better if I had split the mtime into different ranges and done the move in smaller chunks - something like the commands below...
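(The day ranges below are only an illustration; the right splits depend on how the files are spread over time.)

find ./ -type f -mtime +365 -print | xargs -i mv {} /dest/
find ./ -type f -mtime +180 -mtime -366 -print | xargs -i mv {} /dest/
find ./ -type f -mtime +79 -mtime -181 -print | xargs -i mv {} /dest/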

Anyway, now I have a different problem. I want to compress all the files from 2006 - the old ones.

This doesn't work:
tar -cvf find ./ -type f -mtime +1176 -print

After lots of processing, it gives me an error: no such file exists.

How do I get this done? I have gzip as well, but I want it to process multiple files...

By the way, how do I find out which server OS I am working on? Unix environment related info... Lots of basic questions here!! :)
Once again, thanks!

The basics for stating your environment:

# Operating system
uname -a
# Shell
echo ${SHELL}

How many files do you have to move and archive? I make it about 1.7 million files, based on your stated average file size (assuming your 3090 is in bytes) and a total size of about 5 GB.
# To count the files which match your condition
find <condition> -print | wc -l
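That estimate is simply the total size divided by the average size: 5247124 KB is roughly 5.4e9 bytes, and 5.4e9 / 3090 comes to about 1.7 million files (again assuming the 3090 is in bytes).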

Are all these files in the same source directory?
Are there subdirectories under your source directory?
Are the source directory and the destination directory under the same mountpoint?
Do any of the filenames contain space characters or other awkward characters such as asterisk?

Thanks, Methyl.

It was good to explore uname & the shell. I only knew about uname -a.

Getting back on track:
All the files are under the same directory, on the same mount point. None of the files have funny characters like *, &, etc.

Your analysis is correct for the size & numbers. So let's continue with that.

I want to compress them all. -exec will certainly not work.

find ./ -type f -mtime +1176 | gzip -c > temp.gz

The above is not sufficient for millions of files... It creates temp.gz, but I am not sure whether this will really work with thousands of files or not.

Even the code below doesn't solve the problem:

find ./ -type f -mtime +1176 -print | xargs -n1 -i tar -cvf {}

Creating a temp file which contains the names of all the files, and then reading it to compress the files, looks like a long way around to me. Can't we do it in a single-line command?

A very similar question is posted in a different thread of mine -

It's fine if you want to move over there - sorry for the confusion - the tasks are actually different but related, so...

(You may reply in the new thread or here as well, I am just concerned about the solution!! :))

Thanks,
Kedar

find . -type f -mtime +1176 | 
  xargs tar cf - | gzip > all_"$(date +%F)".tgz

On a GNU system:

find -type f -mtime +1176 -print0 | 
  xargs -0 tar c | bzip2 > all_"$(date +%F)".tbz2
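One caveat worth keeping in mind: with this many names, xargs is likely to invoke tar more than once, so what reaches the compressor is really several tar archives back to back, and a plain tar -t on the result may stop after the first one. If GNU tar is installed on your box (the gtar name and its location are assumptions - adjust to wherever it lives), it can read the name list itself and write a single archive:

find . -type f -mtime +1176 | gtar cf - --files-from=- | gzip > all_"$(date +%F)".tgz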

Thanks, Radoulov.

Will try this & let you know if I get stuck anywhere.

1) Will this delete the existing files, or will I have to do that manually?
(It would be good if this command could delete them as well; otherwise I will have to fire this costly operation - find - once again to delete them. Although not a big problem.)

2) I am not able to untar this directly. What's the way to check whether this tar is safe (okay)?

The command below returns errors:

tar -tvf older_then_2008_2.tgz
tar: directory checksum error

I need to make sure the files are still good before I delete the uncompressed files...

Thanks!
Kedar

No, it will only archive the files. You should use another command to remove them:

find . -type f -mtime +1176 -exec rm {} +

If your find implementation does not support the + operator,
use xargs:

find . -type f -mtime +1176 | 
  xargs rm

You should use something like this:

gzip -dc older_then_2008_2.tgz | tar -tvf -
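If you want a rough sanity check before deleting anything, you could also compare the number of entries in the archive with the number of files the same find selects (only an idea - matching counts are not a guarantee of integrity):

gzip -dc older_then_2008_2.tgz | tar -tf - | wc -l
find . -type f -mtime +1176 | wc -l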

We really need to know what Operating System you are running and which is your preferred Shell.
Many Unixes will not deal with single files above 2 GB - especially in tar. There may be other limitations in your OS which will require breaking the task down into manageable units.

Hi,

The error is related to large file support: your mv command fails when a file is larger than 2 GB.

Please make sure the mv you run comes from a largefile-aware location.
Type 'which mv' to get the current location of mv. If it is /bin,
you can try /usr/local/bin/mv (use the explicit path in the find command).

Instead of plain mv you have to specify the explicit path.

I had this problem and corrected it this way.
You can check the Solaris man page for largefile.
Your file size is 5+ GB.
Hope it helps.

Okay... Thanks for confirming... I wasn't so confident about that!

Yes, + won't work here. I've got thousands of arguments, so xargs is the way to go. Actually I am not facing any problem with the removal of files; it was primarily a compression problem.
And the one below works, so I am good now:

gzip -dc older_then_2008_2.tgz | tar -tvf -

Thanks for the quick turnaround!!

@Methyl,
That is -
SunOS 5.8 Generic_117350-41 sun4u sparc SUNW,Sun-Fire-V440

I don't think I have any files larger than 2 GB. That is the total size of the files. My actual problem is the number of files, which is extremely large.

5 GB is the total size of all the files returned by the find command; that's not a single file. Jambesh, you have given really useful info here, but I didn't get your point.
It's just out of interest - you mentioned some method to move files larger than 2 GB. Can you please explain it? If you can give me commands/examples, it would be easier to understand.

By the way, thanks jambesh & methyl for sharing such valuable information.

Notice that find ... -exec rm {} + works like find ... | xargs:

  1. Just like xargs it should work with any number of arguments.
  2. Unlike Solaris xargs, it will handle "pathological" filenames correctly.
  3. Solaris find supports it.