find is not only a utility to find files - that is, to produce a list of filenames to be printed - but a "programmable commandline filemanager", so to say.
How is that done? The basic operation is this: find finds all files and directories below the given starting point and prepares an initial "result set". Then you have one or more clauses, each of which returns a logical value, TRUE or FALSE. Each file/directory in the result set is presented to the first clause. If it returns TRUE, the file/directory is kept in the result set; otherwise it is dropped. If it is kept, it is presented to the second clause, etc.
An example:
find /some/path -name "foo" -print
The initial result set is all files/directories in /some/path. This list is presented to the clause -name "foo", and if the name of the file/directory is "foo" it is kept, otherwise dropped. What still is in the result set is then presented to the -print clause, which just prints it, without modifying the result set further.
So far, so basic, but it is necessary to understand this mechanism of presenting one filename/directory name after the other to each of these clauses successively.
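To watch the clauses narrow down the result set, here is a small sketch you can run safely. It builds a throwaway directory tree first; every name under $demo is invented for the demonstration.

```shell
# Build a scratch tree; all names below are made up for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/dir1" "$demo/dir2/foo"
touch "$demo/dir1/foo" "$demo/dir1/bar"

# -name "foo" keeps the file dir1/foo AND the directory dir2/foo ...
by_name=$(find "$demo" -name "foo" -print | sort)

# ... and a following -type f clause drops the directory again.
files_only=$(find "$demo" -name "foo" -type f -print)

printf '%s\n' "name only:" "$by_name" "name and type:" "$files_only"
rm -rf "$demo"
```

The second command gets presented exactly the same entries, but the extra clause removes everything that is not a regular file before -print sees it.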
Remember I said "file manager" up there? Up to now we only produce - more or less tailored - lists of file-/directorynames. Now we want to actually do something with the files/directories found that way. For this there is a special clause: -exec.
-exec takes a "template commandline" and executes this template commandline with every file/directory in the result set. An example:
find /some/path -name "foo" -type f -print
/some/path/dir1/foo
/some/path/dir2/foo
/some/path/dir2/subdir/foo
Now we replace the -print with -exec in this command:
find /some/path -name "foo" -type f -exec echo file found: {} \;
file found: /some/path/dir1/foo
file found: /some/path/dir2/foo
file found: /some/path/dir2/subdir/foo
What has happened? First, the {} is the placeholder for the filename which is presented to the clause. That is, -exec executed these commands:
echo file found: /some/path/dir1/foo
echo file found: /some/path/dir2/foo
echo file found: /some/path/dir2/subdir/foo
Second, you need a way to tell find where the "template commandline" for -exec ends and the normal find commandline resumes. This is done by a semicolon, which has to be escaped so that the shell, into which you type the whole find command, passes it on to find instead of treating it as a command separator, hence the \; at the end. Here is a more complex example I have annotated:
                                                     normal commandline resumes
                                                                  |
                                 template commandline ends here   |
                                                              |   |
<--------normal commandline-------------> <--template cmdl--> V  <---->
find /some/path -name "foo" -type f -exec echo file found: {} \; -print
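The backslash is only one way of getting the semicolon past the shell; quoting it does the same job. A minimal sketch, again with an invented scratch directory and file name:

```shell
demo=$(mktemp -d)
touch "$demo/foo"

# Three spellings of the same terminator: the shell removes the escaping
# and find receives a plain ";" argument each time.
a=$(find "$demo" -name "foo" -exec echo file found: {} \;)
b=$(find "$demo" -name "foo" -exec echo file found: {} ';')
c=$(find "$demo" -name "foo" -exec echo file found: {} ";")

printf '%s\n' "$a" "$b" "$c"
rm -rf "$demo"
```

All three commands print the same line, so which spelling you use is a matter of taste.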
Notice that you can use the -exec clause even to select from the result set: if the template command returns TRUE (that is, exits with status 0) when executed with the file/dir, the file/dir stays in the result set; otherwise it is dropped. You can even have more than one -exec clause, where some only help shape the result set and the final one actually does the work.
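A common use of -exec as a filter is to select files by their content. The following is a sketch only: the scratch directory, the file names and the search word "needle" are all assumptions made up for the demonstration.

```shell
demo=$(mktemp -d)
printf 'needle\n' > "$demo/a.txt"
printf 'hay\n'    > "$demo/b.txt"

# grep -q prints nothing and exits with status 0 (TRUE) only when the
# pattern is found, so only files containing "needle" reach -print.
matches=$(find "$demo" -type f -name "*.txt" -exec grep -q needle {} \; -print)

printf '%s\n' "$matches"
rm -rf "$demo"
```

Here the -exec clause does no visible work at all; it only decides which files the final -print clause gets to see.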
Finally, some performance considerations. Consider the following example:
find . -name "*txt" -exec cat {} \;
This will produce a (potentially long) list of commands cat foo.txt, cat bar.txt, etc. As this list could grow very long, many processes will be started, which might tax the system (starting a process is actually "expensive" resource-wise). But cat could be called this way:
cat file1 file2 file3 [...]
and this way one cat process would be started for a whole group of files and not for each one. This is what the + is for. Use this:
find . -name "*txt" -exec cat {} +
to do exactly that: group the files in the result set and call cat with each group instead of with each file individually.
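If you want to see the difference in the number of processes for yourself, here is a small sketch. It replaces cat with a tiny sh -c command that prints one line every time it is started; the scratch directory and the three file names are invented, and -type f is added only so that the scratch directory itself can never match.

```shell
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# sh -c 'echo run' prints one line per process started by -exec.
# The trailing "sh" fills $0; the file names become $1, $2, ...
per_file=$(find "$demo" -type f -name "*txt" -exec sh -c 'echo run' sh {} \; | grep -c run)
batched=$(find "$demo" -type f -name "*txt" -exec sh -c 'echo run' sh {} + | grep -c run)

printf '%s\n' "processes with \\; : $per_file" "processes with +  : $batched"
rm -rf "$demo"
```

With only three files the group fits into a single command line, so the + variant starts one process where the \; variant starts three. With very many files find splits them into several groups, but still far fewer than one per file.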
I hope this helps.
bakunin