xargs vs exec with find:

Hi All,

I'm trying to create a tar archive of all the .txt files I find in my directory. I've used xargs to achieve this, but I wanted to do the same with exec, and it looks like it only archives the last file it finds. Can someone advise what's wrong here?

find . -type f -name "*.txt" -print0 | xargs -0 tar -cf txtarchive.tar

$ tar -tvf txtarchive.tar
-rw-r--r-- user/None  123 2012-06-30 17:39 ./employe.txt
-rw-r--r-- user/None   18 2012-07-04 12:59 ./file.txt
-rw-r--r-- user/None    5 2012-07-04 12:59 ./file2.txt
-rw-r--r-- user/None   13 2012-07-14 14:34 ./files.txt
-rw-r--r-- user/None   72 2012-07-12 14:12 ./filesfound.txt
-rw-r--r-- user/None 4938 2012-07-05 09:53 ./sql.txt
-rwxrwxrwx user/None    6 2012-02-15 16:35 ./users.txt

the same action but using exec:

$ find . -type f -name "*.txt" -exec tar -cvf archivetxt1.tar '{}' \;
./employe.txt
./file.txt
./file2.txt
./files.txt
./filesfound.txt
./sql.txt
./users.txt

$ tar -tvf archivetxt1.tar
-rwxrwxrwx chaitanya/None    6 2012-02-15 16:35 ./users.txt

Can someone advise what's wrong with my usage of exec here?

cheers

You should use update (-u). With -c you are creating a new tar archive on every invocation, overwriting the existing one (if any), so only the last .txt file found ends up in the archive. It would look like this:

find ./ -type f -name "*.txt" -exec tar -uf myarchives.tar '{}' \;
tar -tvf myarchives.tar

The files are added in the order find encounters them, not alphabetically or in any other order. Note that entries already in the archive stay there even if the files no longer exist in the directory; only files that still exist are updated.
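
To see the difference for yourself, here is a small sketch. It assumes a scratch directory from mktemp and three made-up files (a.txt, b.txt, c.txt and the archive names are not from this thread), and it relies on tar creating the archive when -u is given a name that does not exist yet, which is how the tar used above behaves.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for name in a b c; do echo "$name" > "$name.txt"; done

# -c (create), run once per file: every run replaces the archive.
find . -type f -name "*.txt" -exec tar -cf created.tar '{}' \;
created_count=$(tar -tf created.tar | wc -l | tr -d ' ')

# -u (update), run once per file: every run adds to the archive.
find . -type f -name "*.txt" -exec tar -uf updated.tar '{}' \;
updated_count=$(tar -tf updated.tar | wc -l | tr -d ' ')

echo "entries after -c: $created_count, entries after -u: $updated_count"
```

The first archive should hold a single entry (whichever file find reached last) and the second all three.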
----------------
To include only the files that are currently present, remove the existing archive first:

rm myarchives.tar
find ./ -type f -name "*.txt" -exec tar -uf myarchives.tar '{}' \;
tar -tvf myarchives.tar

-----------------
To always append, without checking whether an entry is already in the archive and without removing entries for files that no longer exist, use append (-r) instead of update:

find ./ -type f -name "*.txt" -exec tar -rf myarchives.tar '{}' \;
tar -tvf myarchives.tar

----------------

You can also have find run the exec command at the end of the search and pass all the found files as arguments at once (instead of executing a separate command for each file). This is much faster when there are a lot of files, but can cause problems when the argument list gets too long:

find ./ -type f -name "*.txt" -exec tar -cvf txtarchive.tar {} +
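
A quick way to check how many times find actually runs the command is to swap tar for a tiny sh -c that prints one line per invocation. This is only a sketch: the five file-N.txt files and the scratch directory are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3 4 5; do echo "$i" > "file-$i.txt"; done

# Each invocation of the inner sh prints exactly one line (its argument count),
# so counting lines counts invocations.
per_file_runs=$(find . -type f -name "*.txt" -exec sh -c 'echo "$#"' sh '{}' \; | wc -l | tr -d ' ')
batched_runs=$(find . -type f -name "*.txt" -exec sh -c 'echo "$#"' sh '{}' + | wc -l | tr -d ' ')

echo "one command per file: $per_file_runs runs"
echo "batched with +: $batched_runs run(s)"
```

With only five short names, the batched form should need a single run, while the per-file form needs five.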

--------------------
This avoids the maximum-arguments problem (thanks to jlliagre for the tip), but the archive must be deleted first. find still runs the exec command more than once, but far fewer times than the number of files.

rm myarchives.tar
find . -type f -name "file-*.txt" -exec tar uf myarchives.tar {} +
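
Hitting the real argument limit takes tens of thousands of files, so as a rough stand-in this sketch uses xargs -n 3 to force tiny batches. That is an assumption made purely for the demonstration; find's own batches are far larger. The seven file-N.txt files are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3 4 5 6 7; do echo "$i" > "file-$i.txt"; done

# Start from a clean slate; -f keeps rm quiet if the archive is not there.
rm -f myarchives.tar

# -n 3 means tar is run three times (3 + 3 + 1 names).
# With -u every run adds to the same archive instead of replacing it.
find . -type f -name "file-*.txt" -print0 | xargs -0 -n 3 tar -uf myarchives.tar

updated_total=$(tar -tf myarchives.tar | wc -l | tr -d ' ')
echo "entries in archive: $updated_total"
```

All seven files should be listed even though tar was invoked three times.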

cheers tribe, that worked.

I guess xargs builds a file list first before executing the command.

It looks like exec is not doing that. Do you know of any switch or anything else that builds the file list first and then executes the command?

Thanks

So you want to force the create flag?

You can run the exec command at the end of the find search, and redirect all found arguments at once:

find ./ -type f -name "*.txt" -exec tar -cvf txtarchive.tar {} +

yes, this is what i wanted. it works like a charm. thank you for this.

cheers

Keep in mind that if the size/number of the matching filenames exceeds what can be passed to tar in one invocation, that will silently revert to the original problem, of tar clobbering the archive generated by the previous iteration.
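
The clobbering can be reproduced on a small scale. The sketch below assumes xargs -n 3 as a stand-in for the real limit (forcing batches of three names), with seven made-up files in a scratch directory. The same thing happens with find's + form once the list is long enough to be split.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3 4 5 6 7; do echo "$i" > "file-$i.txt"; done

# -n 3 caps each tar invocation at three names: batches of 3, 3 and 1.
# Each tar -c replaces the archive written by the previous batch.
find . -type f -name "file-*.txt" -print0 | xargs -0 -n 3 tar -cf clobbered.tar

clobbered_count=$(tar -tf clobbered.tar | wc -l | tr -d ' ')
echo "files on disk: 7, entries in archive: $clobbered_count"
```

Only the final batch, a single file, should survive, and neither xargs nor tar reports an error.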

For archiving files found with find, pax or cpio are much more convenient than tar, since they can read the list on stdin.

Regards,
Alister


Exactly, there is a problem if the number of existing .txt files exceeds the maximum number of arguments that can be handled. For example, this fails on my system:

for i in {1..50000}; do echo > file-$i.txt; done
find ./ -type f -name "file-*.txt" -exec tar -cf txtarchive.tar {} +

There are 50000 .txt files in the directory:

find ./ -name "file-*.txt" | wc -l
50000

But the archive contains only 3138 of them, and no error was shown:

tar -tf txtarchive.tar  | wc -l
3138
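
For a rough idea of where the limit sits on a given system, getconf ARG_MAX reports the kernel's cap on the combined size of arguments and environment for one command. This is a sketch only: find may well split its batches below that figure (an assumption about the implementation, not something shown in this thread), and the sample name is just one of the generated file names.

```shell
# Upper bound, in bytes, on arguments plus environment for a single command.
arg_max=$(getconf ARG_MAX)

# Each name costs at least its length plus one terminating NUL byte.
sample="./file-12345.txt"
bytes_per_name=$(( ${#sample} + 1 ))

echo "ARG_MAX: $arg_max bytes"
echo "a name like $sample costs at least $bytes_per_name bytes"
```

Multiplying the per-name cost by 50000 gives a feel for how quickly the list outgrows a single invocation.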

-----------
On the other hand, running a separate exec for each file is terribly slow when the number of files is high.

Could you provide examples with pax or cpio ?

This should work if for some reason one insists on using tar:

touch txtarchive.tar && find . -type f -name "file-*.txt" -exec tar uf txtarchive.tar {} +
With pax:

find path -type f -name '*.txt' | pax -w > txtarchive.tar

Many tar implementations have an option that allows them to read a file list from a file, allowing for find's output to be redirected and then read. In my opinion, not a very elegant solution. If I recall correctly, BSD tar uses -I and GNU tar uses -T. Like ps, tar is a ubiquitous utility whose usage, for anything non-trivial, varies a great deal between platforms of different lineage.
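
For what it's worth, here is a sketch of the file-list approach, assuming GNU tar: -T - reads the list from stdin and --null pairs with find's -print0 so that names with spaces survive. Other tar implementations may spell these options differently. The files and scratch directory are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3; do echo "$i" > "file-$i.txt"; done
echo "x" > "name with space.txt"

# tar is started exactly once, however long the list is,
# so -c cannot overwrite an earlier batch.
find . -type f -name "*.txt" -print0 | tar -cf txtarchive.tar --null -T -

listed_count=$(tar -tf txtarchive.tar | wc -l | tr -d ' ')
echo "entries in archive: $listed_count"
```

Because the names travel over a pipe rather than the command line, the argument limit does not come into play.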

Regards,
Alister


Thank you, it worked with pax and cpio after the proper fixes.

find path -name '*.txt' -type f  | pax -w > txtarchive.tar
find path -name '*.txt' -type f | cpio -ov -H ustar -F txtarchive.tar