Hi Chubler, thanks, of course. That was pretty silly.
Then it gets a bit more complicated, as we would need to chop the file list into snack-size chunks. We could try this:
oldIFS=$IFS
IFS="
"
# Grep one chunk ("snack") of files for the current word,
# printing a header line the first time the word is found.
write_snack()
{
  if a=$(grep -Fwil "$word" $snack); then
    if ! $wordfound; then
      printf "%s\n" "$word is found in: "
      wordfound=true
    fi
    printf "%s\n" "$a"
  fi
  i=0
  snack=""
}
snacksize=25 # Nr of files to feed to grep at a time
i=0 snack=""
filelist=$(find /path/to/files -type f)
while read word
do
  wordfound=false
  for f in $filelist
  do
    # Add the file first, then flush once the chunk is full;
    # otherwise the file that triggers the flush would be skipped.
    snack=${snack}${IFS}${f}
    if [ $((i+=1)) -ge $snacksize ]; then
      write_snack
    fi
  done
  if [ $i -gt 0 ]; then
    write_snack   # flush the remainder
  fi
  printf "\n"
done < input.txt > output.txt
IFS=$oldIFS
The last post by Scrutinizer suggested to me that parallelization might be feasible here.
The OP says nothing about the characteristics of the AIX box, but I seem to recall having used AIX on a 12-CPU dual 3090 that had a lot of processing power (regrettably only 32-bit, but that's another story).
So if the box has enough oomph, then firing off a number of background processes, each handling a number of files, could decrease the real time, which is apparently what concerns the OP.
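A minimal sketch of that idea, using only POSIX split and xargs. The file names, chunk size, and the word being searched for are all made up for illustration; a throwaway temp directory stands in for the real file tree:

```shell
# A toy file tree stands in for the real one; names are illustrative.
dir=$(mktemp -d)
mkdir "$dir/files"
printf 'hello world\n' > "$dir/files/a.txt"
printf 'goodbye\n'     > "$dir/files/b.txt"
printf 'hello again\n' > "$dir/files/c.txt"

word=hello
find "$dir/files" -type f > "$dir/filelist"

# Split the file list into chunks of 2 files each (POSIX split -l).
split -l 2 "$dir/filelist" "$dir/chunk."

# One background grep per chunk; each job writes to its own output file.
for c in "$dir"/chunk.??; do
  xargs grep -Fwil "$word" < "$c" > "$c.out" &
done
wait   # let all background jobs finish before collecting results

cat "$dir"/chunk.*.out
```

Each chunk runs on its own CPU, so on a multi-processor box the wall-clock time should drop roughly with the number of chunks, as long as the disk can keep up.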
The runtime for this code was about 20 minutes vs. my script's runtime of 2 hours. YAY! Unfortunately, when I ran a comparison on the output files, I found several differences. I'll have to look at the individual files to see why they're different. Thanks!
The difference may be because in my script grep is using the -F option, which means a literal match. If you don't do that with arbitrary strings, you may get unintentional matches; for example, a single . (dot) means "any character". If input.txt contains regular expressions instead of plain strings, then you should leave out the -F option...
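For instance, with a hypothetical one-line input:

```shell
# Without -F the pattern is a regular expression: "." matches any character.
printf 'cat\n' | grep -c 'c.t'              # prints 1

# With -F the pattern is a literal string: "c.t" is not "cat".
# (|| true because grep exits non-zero when nothing matches.)
printf 'cat\n' | grep -cF 'c.t' || true     # prints 0
```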