bash: How to reuse the search result of "find"

LessNux · March 14, 2012, 6:17pm

find . -type f -print0 | xargs -0 chmod 600

find . -type f

In bash, I would like to pass the search result of "find" to another command and also print it to standard output. The code above performs the same search twice: once for "xargs -0 chmod" and once for stdout. I would like to avoid this redundancy by searching only once and using the result both for the other command and for standard output.

Assume that filenames may contain any character except the null character (0x00); in particular, they may contain newline (0x0A) characters. In the code above, the -print0 and -0 options set the separator to the null character.

If I allow myself to use a file to save the search result, then the work can be accomplished in the following manner.

find . -type f -print0 > /tmp/found.dat
xargs -0 chmod 600 < /tmp/found.dat
tr '\0' '\n' < /tmp/found.dat

However, I do not want to use a file to save the search result. So, I tried a variable to save the search result.

vFound="$(find . -type f -print0)"
echo "$vFound" | xargs -0 chmod 600
echo "$vFound" | tr \\0 \\n
#or 
#printf %s "$vFound" | xargs -0 chmod 600
#printf %s "$vFound" | tr \\0 \\n

However, the above code failed. It seems that bash removes null characters when bash expands the variable.

I also tried "tee" to split the search result between "xargs -0 chmod" and stdout, but the following attempt failed.

find . -type f -print0 | tee - | xargs -0 chmod 600

Can you show me how to use the search result of "find" for another command and also send it to standard output, without saving it to a file? Neither the variable approach nor the "tee" approach worked for me.

Many thanks in advance.

Corona688 · March 14, 2012, 6:36pm

You could do this with a named pipe:

mkfifo mypipe
something_else_using mypipe &
find . -type f -print0 | tee mypipe | xargs -0 chmod 600

Something else has to be reading from mypipe before tee can write to it; otherwise tee will block until a reader opens the pipe.
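A fuller sketch of the named-pipe approach, assuming a scratch directory from mktemp so that nothing is created in the current directory. The reader command (tr writing to a listing file), the sample files, and all variable names are illustrative stand-ins, not part of the suggestion above.

```bash
#!/bin/bash
# Sketch only: the scratch layout and the reader command are assumptions.
workdir=$(mktemp -d)
mkdir "$workdir/tree"
touch "$workdir/tree/a" "$workdir/tree/b c"
chmod 644 "$workdir/tree/a" "$workdir/tree/b c"
mkfifo "$workdir/mypipe"

# Start the reader first, in the background. It blocks until tee opens the FIFO.
tr '\0' '\n' < "$workdir/mypipe" > "$workdir/listing.txt" &
reader_pid=$!

# One search feeds both the FIFO (via tee) and chmod (via xargs).
find "$workdir/tree" -type f -print0 | tee "$workdir/mypipe" | xargs -0 chmod 600

wait "$reader_pid"                       # let the reader drain the FIFO
listing=$(cat "$workdir/listing.txt")
printf '%s\n' "$listing"

# Count the files that now have mode 600, then clean up the scratch area.
changed=$(find "$workdir/tree" -type f -perm 600 | wc -l)
rm -rf "$workdir"
```

Starting the reader before the writer is what keeps tee from blocking indefinitely.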

LessNux · March 15, 2012, 3:36am

Thanks Corona688 for reminding me of named pipes. However, the statement "mkfifo mypipe" creates a file "mypipe" in the current directory. I do not want to use any files explicitly. So, instead of Corona688's explicit named pipe, I have ended up with the following implicit named pipe.

find . -type f -print0 | tee >(xargs -0 chmod 600) | tr \\0 \\n

The greater-than symbol followed by a parenthesized command, >(...), is the implicit named pipe; bash calls this process substitution. It does not need a "mkfifo" statement. This single-line solution works and satisfies my initial requirement.
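A self-contained sketch of the same one-liner, run against a scratch directory so it can be tried safely. The directory, the sample filenames, and the variable names are assumptions made for illustration. The >(...) syntax requires bash (or another shell with process substitution), not plain sh.

```bash
#!/bin/bash
# Sketch only: the scratch tree below is an illustrative assumption.
workdir=$(mktemp -d)
touch "$workdir/a" "$workdir/b c"
chmod 644 "$workdir/a" "$workdir/b c"

# tee copies the NUL-delimited list to the process substitution (chmod)
# and to its own stdout, which tr turns into newline-delimited text.
listing=$(find "$workdir" -type f -print0 | tee >(xargs -0 chmod 600) | tr '\0' '\n')
printf '%s\n' "$listing"

# Count the files that now have mode 600, then clean up.
changed=$(find "$workdir" -type f -perm 600 | wc -l)
rm -rf "$workdir"
```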

By the way, the purpose of "tr \\0 \\n" is to convert the null-delimited result back into newline-delimited data for stdout. I might further change the tr statement to the following so that any newline characters inside filenames are shown as question marks.

tr '\n\0' '?\n'
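A quick check of that mapping on made-up data, as a sketch: the two sample names stand in for "find -print0" output, and the first one contains an embedded newline. The arguments are quoted here so that the shell cannot treat the question mark as a glob character.

```bash
# Hypothetical sample: two NUL-terminated names, the first with a newline inside.
sample_listing=$(printf 'odd\nname\000plain\000' | tr '\n\0' '?\n')

# Prints two lines: "odd?name" and "plain".
printf '%s\n' "$sample_listing"
```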

LessNux · March 19, 2012, 4:11pm

If I need to use (or reuse) the search result later rather than immediately after "find", I would like to save it in a variable. However, the search result contains null characters as separators, and bash removes null characters either at variable assignment or at variable expansion. Percent-encoding is a useful way to preserve them.

Percent-encoding is widely used for URL encoding. The following method applies it only to the null character and to the percent sign itself.

#!/bin/bash
#percent-encoding for null character (the \x0 escape is a GNU sed extension)
PercentizeNull()
{
  sed "s/%/%25/g" | sed "s/\x0/%00/g"
}
DepercentizeNull()
{
  sed "s/%00/\x0/g" | sed "s/%25/%/g"
}
echo -ne "A\0B\0%\0" | hexdump -C
foo=$(echo -ne "A\0B\0%\0" | PercentizeNull)
printf %s "$foo" | DepercentizeNull | hexdump -C

The following example saves the search result of "find" into a variable with percent-encoding.

PercentizeNull()
{
  sed "s/%/%25/g" | sed "s/\x0/%00/g"
}
DepercentizeNull()
{
  sed "s/%00/\x0/g" | sed "s/%25/%/g"
}

vFound=$(find . -perm /o+rwx ! -type l -print0 2> /dev/null | PercentizeNull)

if test "$vFound" ; then
  echo "current permission"
  printf %s "$vFound" | DepercentizeNull | xargs -0 ls -ld
  printf %s "$vFound" | DepercentizeNull | xargs -0 chmod o-rwx
  echo "new permission"
  printf %s "$vFound" | DepercentizeNull | xargs -0 ls -ld
fi

By the way, does anyone know when bash removes null characters: at variable assignment or at variable expansion? In other words, does bash never store null characters in a variable, or does it store them and then strip them when the variable is expanded?

Furthermore, without any encoding (such as percent encoding), is there any option to prevent bash from removing null characters?

Chubler_XL · March 19, 2012, 4:45pm

I think it's at assignment:

$ T=$(printf 'A\000B')   # not "A\x0B": \x0B is a single vertical-tab byte, not NUL followed by B
$ echo ${#T}
2
 
$ T=$(echo "AAA" | sed "s:A:\x0:"g)
$ echo ${#T}
0

I suspect bash uses the standard C string functions to handle its variables, and those treat NUL as the string terminator. One could imagine that implementing a new string type that supports embedded NUL characters wasn't considered worth the effort.
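As a sketch of an encoding-free alternative: a bash array can hold the filenames as separate elements, so the NUL separators never need to live inside a single string. The scratch directory, sample filenames, and variable names below are assumptions for illustration. The loop relies on bash's "read -d ''" and on process substitution, so it needs bash rather than plain sh.

```bash
#!/bin/bash
# Sketch only: the scratch tree and all names here are illustrative assumptions.
workdir=$(mktemp -d)
touch "$workdir/plain" "$workdir"/$'odd\nname'
chmod 644 "$workdir/plain" "$workdir"/$'odd\nname'

# Command substitution: the NUL byte is already gone from the stored value.
# (Newer bash versions also print a warning about it, silenced here.)
{ flat=$(printf 'A\000B'); } 2>/dev/null
echo "${#flat}"     # prints 2

# Read the NUL-delimited find output into an array, one element per file.
found=()
while IFS= read -r -d '' path; do
  found+=("$path")
done < <(find "$workdir" -type f -print0)

# Reuse the saved result later, as often as needed.
if (( ${#found[@]} )); then
  printf '%s\0' "${found[@]}" | xargs -0 chmod 600
  printf '%q\n' "${found[@]}"       # shell-quoted display, safe for odd names
fi

# Count mode-600 files by counting NUL terminators (newline-safe), then clean up.
changed=$(find "$workdir" -type f -perm 600 -print0 | tr -cd '\0' | wc -c)
rm -rf "$workdir"
```

Because each filename is its own array element, "${found[@]}" can also be passed directly to a command, for example chmod 600 "${found[@]}".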

BTW, the contents of a named pipe aren't stored in the filesystem. The file is only a rendezvous point, much like a /dev/ file; the data itself passes through kernel memory.

Also, how about using the -e sed param in PercentizeNull() and DepercentizeNull() to reduce the number of sed processes run:

PercentizeNull()
{
  sed -e "s/%/%25/g" -e "s/\x0/%00/g"
}

LessNux · March 20, 2012, 5:40am

Thanks, Chubler_XL.

It was indeed clumsy of my code to invoke "sed" several times in a row. The functions are better written as you suggested.

#percent-encoding for null character
PercentizeNull(){
  sed -e "s/%/%25/g" -e "s/\x0/%00/g"
}
DepercentizeNull(){
  sed -e "s/%00/\x0/g" -e "s/%25/%/g"
}

Alternatively, separating the sed commands with a semicolon also avoids the repeated invocations of "sed".

#percent-encoding for null character
PercentizeNull(){
  sed 's/%/%25/g ; s/\x0/%00/g'
}
DepercentizeNull(){
  sed 's/%00/\x0/g ; s/%25/%/g'
}