What -exec tar -cvf * /something/ * does depends only on your current directory, because the *'s are expanded by the shell before find is run. -exec is not a shell and would not expand the *'s if you escaped them, either.
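A minimal sketch of that expansion, assuming a scratch directory made with mktemp and two hypothetical files (one.txt, two.txt). echo stands in for whatever command would receive the arguments:

```shell
# Work in a throwaway directory so the glob has something to match.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch one.txt two.txt

# Unquoted: the shell replaces * with the matching names before echo runs.
expanded=$(echo tar -cvf *)
# Quoted: the literal * is passed through, and nothing downstream globs it.
literal=$(echo tar -cvf '*')

echo "unquoted: $expanded"
echo "quoted:   $literal"
```

The first line shows the filenames already filled in, which is exactly what find would be handed.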
I want it to tar up the results of the find command. There will be other parameters, such as size, age, etc. But for this example I only included -type f.
>> /directory/backup/log/2.log is similarly handled by shell, not exec, but in this case probably does close to what you want, except it will capture all stdout, not just ls. This may not matter if nothing else prints to stdout.
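To illustrate, here is a small sketch with a scratch directory and a hypothetical log path next to it. The echo commands only print what would be run:

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/file1" "$demo_dir/file2"
log="$demo_dir.log"   # sibling of the directory, so find does not see it

# The shell opens the log once for the whole find process, so both -exec
# commands write into it, not just the first one.
find "$demo_dir" -type f -exec echo ls '{}' ';' -exec echo tar '{}' ';' >> "$log"

cat "$log"
```

The log ends up with the ls lines and the tar lines interleaved.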
Since tar may be run multiple times, you need to use the append option, not the create option.
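A quick sketch of the difference, assuming a tar that supports -r (GNU tar and bsdtar do) and three hypothetical empty files:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file1 file2 file3

# -c recreates the archive on every run, so only the last file survives.
for f in file1 file2 file3; do tar -cf created.tar "$f"; done
# -r appends, so each run adds to what is already there.
for f in file1 file2 file3; do tar -rf appended.tar "$f"; done

created_count=$(tar -tf created.tar | wc -l)
appended_count=$(tar -tf appended.tar | wc -l)
echo "created.tar holds $created_count file(s), appended.tar holds $appended_count"
```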
How about:
$ tar -rf archive.tar # Create empty tar file to append to
$ find testout -type f -exec echo ls -latrd '{}' ';' -exec echo tar -rvf archive.tar '{}' ';' -exec echo rm '{}' ';'
ls -latrd testout/testfile1
tar -rvf archive.tar testout/testfile1
rm testout/testfile1
ls -latrd testout/testfile2
tar -rvf archive.tar testout/testfile2
rm testout/testfile2
ls -latrd testout/testfile3
tar -rvf archive.tar testout/testfile3
rm testout/testfile3
# remove echos to actually run these commands instead of printing them
If your find supports it (the + form is specified by POSIX and implemented by GNU find), you can use + instead of ; for increased efficiency, as it will bundle several files into each call:
$ find testout -type f -exec echo ls -latrd '{}' '+' -exec echo tar -rvf /absolute/path/to/archive.tar '{}' '+' -exec echo rm '{}' '+'
ls -latrd testout/testfile1 testout/testfile2 testout/testfile3
tar -rvf /absolute/path/to/archive.tar testout/testfile1 testout/testfile2 testout/testfile3
rm testout/testfile1 testout/testfile2 testout/testfile3
$
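If you want to see the bundling for yourself, this sketch counts operands per invocation. It assumes three hypothetical files in a scratch directory and uses an inline sh script in place of ls/tar/rm:

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/testfile1" "$demo_dir/testfile2" "$demo_dir/testfile3"

# Each run of the inline script prints one line with its operand count.
per_file=$(find "$demo_dir" -type f -exec sh -c 'echo "$# operand(s)"' sh '{}' ';')
batched=$(find "$demo_dir" -type f -exec sh -c 'echo "$# operand(s)"' sh '{}' '+')

echo "with ';':"; echo "$per_file"
echo "with '+':"; echo "$batched"
```

With only three short pathnames the + form fits everything into a single call. A larger tree may still be split across several calls.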
Here is what I am trying to do, and I could be taking the wrong approach.
I have a bunch of directories that need to have their files purged, based on certain criteria. For example:
/folder/one <- Delete files that are 30 days old, and more than 2MB
/folder/two <- Delete files that are 7 days old, and have the extension PDF
/folder/three <- Delete files that are 7 days old, extension PDF, more than 2MB
We have a script that does a basic find with -exec rm -f, but we want to add logging, take a compressed backup of the files, and put them in a preserve directory for X days in case we need them.
I take it this script would run automatically at intervals. Given that, I think you could make your approach work. I'd break the archiving and deleting into two steps, so you can bail in case of error before files are trashed, rather than after.
Also, once a tarball is created and compressed, it's essentially uneditable, so you have to compress it after you're finished appending to it, not during.
# Logfile for errors, >&2 and any errors printed by tar/gzip/etc
exec 2> /path/to/errorlog
# Logfile for files, captures default stdout
exec 1> /path/to/filelog
TSTAMP=$(date +%Y-%m-%d)
TARBALL=/path/to/folder/$TSTAMP-one.tar
echo "$(date '+%Y-%m-%d %H:%M:%S') $0 Beginning execution" >&2
echo "# Archiving to $TARBALL"
if [ -e "$TARBALL" ] || [ -e "$TARBALL".gz ]
then
echo "$(date '+%Y-%m-%d %H:%M:%S') $TARBALL already exists, refusing to overwrite" >&2
exit 1
fi
tar -rf "$TARBALL" # Create empty tar file to append to
if ! find one -type f -exec echo ls -latr '{}' '+' -exec echo tar -rvf "$TARBALL" '{}' '+'
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Creating archive failed" >&2
rm -f "$TARBALL"
exit 1
fi
if ! gzip "$TARBALL"
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Couldn't compress $TARBALL" >&2
exit 1
fi
if ! find one -type f -exec echo rm '{}' '+'
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Error removing files" >&2
exit 1
fi
echo "$(date '+%Y-%m-%d %H:%M:%S') $0 completed successfully"
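The size, age and extension criteria from the original question could be slotted into the find calls along these lines. This is only a sketch: the scratch directories and file names are made up, -mtime +30 is my reading of "30 days old" (it matches files more than 30 full days old), and -size is given in 512-byte blocks to stay within what POSIX find accepts.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/one" "$demo_dir/two"

# /folder/one stand-in: one old+large file, one new+large, one old+small.
dd if=/dev/zero of="$demo_dir/one/old_big" bs=1024 count=3072 2>/dev/null
dd if=/dev/zero of="$demo_dir/one/new_big" bs=1024 count=3072 2>/dev/null
touch "$demo_dir/one/old_small"
touch -t 202001010000 "$demo_dir/one/old_big" "$demo_dir/one/old_small"

# /folder/two stand-in: old PDF, old non-PDF, new PDF.
touch "$demo_dir/two/old.pdf" "$demo_dir/two/old.txt" "$demo_dir/two/new.PDF"
touch -t 202001010000 "$demo_dir/two/old.pdf" "$demo_dir/two/old.txt"

# -size counts 512-byte blocks, so +4096 means "more than 2 MiB".
one_matches=$(find "$demo_dir/one" -type f -mtime +30 -size +4096)
# Match the extension in either case without relying on -iname.
two_matches=$(find "$demo_dir/two" -type f -mtime +7 \( -name '*.pdf' -o -name '*.PDF' \))

echo "one: $one_matches"
echo "two: $two_matches"
```

Only old_big and old.pdf are selected. The third folder's rule would simply combine all three tests in one find.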
No. Do not use -exec ... + in cases like this. If there are enough files to trigger an invocation of one of these -exec primaries before find has processed the entire file hierarchy, each -exec primary is likely to process a different set of operands than the other -exec primaries. For example, the 1st invocation of ls might process 100 files, the 1st invocation of tar might process 95 files, and the 1st invocation of rm might process 105 files. The 2nd invocations of ls and tar will then fail because the 1st invocation of rm will have removed some of the files before they were listed and archived.
If there aren't enough files in the file hierarchy being processed by find to trigger invocations of those three utilities until the entire file hierarchy has been traversed, all three utilities could be run in parallel, again allowing rm to remove some or all of the files before they are listed and archived.
I see that you have chosen to ignore the problems I mentioned in post #9 in this thread. You do so at your own peril!
From the error you have shown us, we might guess that one or more of the files you are processing has a hyphen as the first character of the pathname that is being passed to ls by find. But that can't be the case with the command line you have shown us, since every pathname that find would pass to ls would have to start with /directory/toscan and the options you have find passing to ls do not include -v.
Are you absolutely positive that the diagnostic you have shown us from ls came from one of the invocations of ls in the find command above?
Then use one find command. But modify Corona's code as follows:
# Logfile for errors, >&2 and any errors printed by tar/gzip/etc
exec 2> /path/to/errorlog
# Logfile for files, captures default stdout
exec 1> /path/to/filelog
TSTAMP=$(date +%Y-%m-%d)
TARBALL=/path/to/folder/$TSTAMP-one.tar
echo "$(date '+%Y-%m-%d %H:%M:%S') $0 Beginning execution" >&2
echo "# Archiving to $TARBALL"
if [ -e "$TARBALL" ] || [ -e "$TARBALL".gz ]
then
echo "$(date '+%Y-%m-%d %H:%M:%S') $TARBALL already exists, refusing to overwrite" >&2
exit 1
fi
tar -rf "$TARBALL" # Create empty tar file to append to
echo ls -ltr "$@"
if ! echo tar -rvf "$TARBALL" "$@"
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Creating archive failed" >&2
rm -f "$TARBALL"
exit 1
fi
if ! gzip "$TARBALL"
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Couldn't compress $TARBALL" >&2
exit 1
fi
if ! echo rm "$@"
then
echo "$(date '+%Y-%m-%d %H:%M:%S') Error removing files" >&2
exit 1
fi
echo "$(date '+%Y-%m-%d %H:%M:%S') $0 completed successfully"
and call it something like archive_and_delete. If AIX has xargs, use:
find .... -print | xargs archive_and_delete
Even better if you can use
find .... -print0 | xargs -0 archive_and_delete
If AIX does not have xargs, use:
find .... -exec archive_and_delete {} '+'
CAVEAT: I haven't tried the above script and I may even have introduced bugs into it with my edit. I am also assuming that you will continue to use the -type f directive to pass filenames rather than directory names.
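The reason -print0 | xargs -0 is preferable shows up as soon as a pathname contains whitespace. A small sketch, assuming your find and xargs support -print0 and -0 (GNU and the BSDs do) and a made-up filename:

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 'monthly report.pdf'

# Blank/newline-delimited: xargs splits the one pathname into two operands.
plain_count=$(find . -type f -print | xargs sh -c 'echo "$#"' sh)
# NUL-delimited: the pathname arrives intact as a single operand.
nul_count=$(find . -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

echo "plain: $plain_count operand(s), NUL-delimited: $nul_count operand(s)"
```

In the plain case the script would be handed ./monthly and report.pdf, neither of which exists.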
This is a problem because rm may remove a file before it is listed by ls and archived by tar.
The -exec ... + primary gathers arguments for each invocation of the specified utility with the guarantee that the argument list used will not exceed the system's ARG_MAX limit. It does not pass a fixed number of operands to the utility on each invocation. The argument list for rm contains only rm before the list of pathname operands, the argument list for ls contains the utility name and its options (ls -latr) before the pathname operands, and the argument list for tar is longer still (tar -rvf /directory/foroutput/archive.tar). There is therefore a chance that the number of pathnames given to tar is less than the number given to ls, which in turn may be less than the number given to rm. The first invocation of rm may then remove one or more files before the second invocation of ls or tar has a chance to process them.
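As a rough worked example of that arithmetic: the byte budget below is hypothetical and far smaller than any real ARG_MAX, every pathname is assumed to be the same length, and real implementations also account for the environment and pointer overhead, so treat the numbers as an illustration only.

```shell
# operands_that_fit BUDGET PATH_LEN WORD...
# Prints how many PATH_LEN-byte pathnames fit in BUDGET bytes once the
# fixed words (utility name and options) are accounted for. Each string
# is counted together with its terminating NUL byte.
operands_that_fit() {
    budget=$1
    path_len=$2
    shift 2
    used=0
    for word in "$@"; do
        used=$((used + ${#word} + 1))
    done
    echo $(( (budget - used) / (path_len + 1) ))
}

budget=4096   # hypothetical limit, chosen only to keep the numbers small
path_len=17   # length of a pathname such as "testout/testfile1"

rm_fit=$(operands_that_fit "$budget" "$path_len" rm)
ls_fit=$(operands_that_fit "$budget" "$path_len" ls -latr)
tar_fit=$(operands_that_fit "$budget" "$path_len" tar -rvf /directory/foroutput/archive.tar)

echo "rm: $rm_fit  ls: $ls_fit  tar: $tar_fit"
```

Under these assumptions rm can take a couple more pathnames per batch than tar, which is all it takes for the batches to drift out of step.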
I don't know whether the implementation of find on the original poster's system does this. The standards say this about -exec ... +:
The text marked in red above clearly allows invocations of the three utilities in the three -exec primaries to be invoked in any order and sequentially or in parallel as long as each of the utilities that needs to be invoked more than once completes processing earlier sets of pathnames for that -exec primary before it is invoked again to process a later set of pathnames for that -exec primary.
Probably it should collect them in parallel but execute them from left to right.
I have found different implementations of {} +, and some are buggy. I suspect that AIX find is buggy, too.
--
A method to run an 'embedded' shell script
find /directory/toscan -type f -exec bash -c '
ls -ltar "$@"
tar -rvf /directory/foroutput/archive.tar "$@"
rm "$@"
' bash {} +
I haven't seen any reports about UNIX-branded implementations (including AIX) of find behaving contrary to the requirements of the standards in the last decade where the given command-line met the requirements stated by the standards. But, old systems and systems that aren't branded (or tested for conformance) do still exist.
On systems where find does meet the standard's requirements, your suggestion above looks like it should work as long as the code marked in red is removed, noting of course that the list of files produced will not be sorted in its entirety if the list of pathnames to be processed is too long to just invoke bash once.
But, if a file can't be archived because tar can't read it, the file may still be removed even though it wasn't archived. If the original poster wants to keep files that couldn't be listed and archived, you would need something more like:
find /directory/toscan -type f -exec bash -c '
ls -ltr "$@" &&
tar -rvf /directory/foroutput/archive.tar "$@" &&
rm "$@"
' {} +
to keep sets of files where one or more files in the list failed, or one of the two following suggestions:
find /directory/toscan -type f -exec bash -c '
for path in "$@"
do ls -ltr "$path" &&
tar -rvf /directory/foroutput/archive.tar "$path" &&
rm "$path"
done
' {} +
or:
find /directory/toscan -type f -exec ls -ltr {} \; -exec tar -rvf /directory/foroutput/archive.tar {} \; -exec rm {} \;
to only keep individual files that weren't successfully archived, but, of course, these will run MUCH slower than the other suggestions and the list of files produced by these will be in the order in which they are found in the searched file hierarchy; not in reverse time order (even in subgroups in the 1st suggestion of these last two).
Note that there is no need for the ls -a option when regular filenames are given as operands (even if their name does start with a <period> character).
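To make the difference between the batch form and the per-file form concrete, here is a sketch that wraps both in hypothetical functions (archive_batch and archive_each are names I made up) and feeds them a list containing one nonexistent pathname to simulate a failure. It assumes a tar that supports -r.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# archive_batch ARCHIVE PATH...: all-or-nothing for the whole operand list.
archive_batch() {
    archive=$1; shift
    ls -ltr "$@" >/dev/null 2>&1 &&
    tar -rf "$archive" "$@" 2>/dev/null &&
    rm "$@"
}

# archive_each ARCHIVE PATH...: each pathname succeeds or fails on its own.
archive_each() {
    archive=$1; shift
    for path in "$@"; do
        ls -ltr "$path" >/dev/null 2>&1 &&
        tar -rf "$archive" "$path" 2>/dev/null &&
        rm "$path"
    done
}

# count_existing PATH...: prints how many of the pathnames still exist.
count_existing() {
    n=0
    for p in "$@"; do
        if [ -e "$p" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

touch a b
archive_batch batch.tar a missing b || echo "batch failed, nothing removed"
batch_left=$(count_existing a b)

archive_each each.tar a missing b
each_left=$(count_existing a b)
each_archived=$(tar -tf each.tar | wc -l)

echo "batch: $batch_left file(s) left; per-file: $each_left left, $each_archived archived"
```

The batch form keeps both good files because ls failed on the list as a whole. The per-file form archives and removes the two good files and skips only the bad pathname.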
@Don, with -exec bash -c '...' you have to supply an argument that sets argv[0] for the script interpreter (bash in this case).
This is certainly true on all Linux and Unix systems; otherwise process names would always be the script interpreter (e.g. bash).
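A short sketch of the effect, run outside find with three made-up operands so that only the bash -c behaviour is on show:

```shell
# With a name after the script, it becomes $0 and every operand lands in "$@".
with_name=$(bash -c 'echo "\$0=$0 operands=$#"' bash file1 file2 file3)
# Without it, the first operand is consumed as $0 and drops out of "$@".
without_name=$(bash -c 'echo "\$0=$0 operands=$#"' file1 file2 file3)

echo "$with_name"
echo "$without_name"
```

In the second case file1 would silently be left out of anything that iterates over "$@".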
For demonstration: