Can someone please help me optimize my code (script searches subdirectories)?

Hi Chubler, thanks, of course. That was pretty silly.

Then it gets a bit more complicated, as we would need to chop the file list up into snack-size chunks. We could try this:

oldIFS=$IFS
IFS="
"
write_snack()
{
  # Search the current chunk; -e keeps a word starting with "-" from being read as an option
  if a=$(grep -Fwil -e "$word" $snack); then
    if ! $wordfound; then
      printf "%s\n" "$word is found in: "
      wordfound=true
    fi
    printf "%s\n" "$a"
  fi
  i=0
  snack=""
}

snacksize=25          # Nr of files to feed to grep at a time
i=0 snack="" 
filelist=$(find /path/to/files -type f)
while read -r word    # -r keeps backslashes in the word intact
do
  wordfound=false
  for f in $filelist
  do
    snack=${snack}${IFS}${f}       # always add the file, so none is skipped
    if [ $((i+=1)) -ge $snacksize ]; then
      write_snack                  # chunk is full: search it and reset
    fi
  done
  if [ $i -gt 0 ]; then
    write_snack
  fi
  printf "\n"
done < input.txt > output.txt
IFS=$oldIFS           # restore the saved field separator
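
For comparison, the same chunking can be left to xargs. This is only a minimal sketch, not a replacement for the script above: the demo directory, the file names and the word "apple" are made up, mktemp is assumed to be available, and file names are assumed to contain no whitespace or quotes (xargs would split them).

```shell
# Hypothetical demo data so the sketch runs on its own.
demo=$(mktemp -d)
for n in 1 2 3 4 5
do
  printf 'file %s mentions apple\n' "$n" > "$demo/f$n.txt"
done

# -n 2 hands grep at most two file names per invocation
# (the script above uses chunks of 25).
found=$(find "$demo" -type f | xargs -n 2 grep -Fwil -e "apple" | sort)
printf '%s\n' "$found"

# Count the matching files to confirm that no chunk was dropped.
count=$(printf '%s\n' "$found" | wc -l)
rm -rf "$demo"
```

With five files and chunks of two, grep runs three times (2 + 2 + 1 files) and every file is still searched.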

Hi.

The last post by Scrutinizer suggested to me that parallelization might be feasible here.

The OP says nothing about the characteristics of the AIX box, but I seem to recall that I have used AIX on a 12-CPU dual 3090 that had a lot of processing power (regrettably only 32-bit, but that's another story).

So if the box has enough oomph, then firing off a number of background processes, each handling a number of files, could decrease the real time, which is apparently what concerns the OP.
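
A rough sketch of that idea, under several assumptions: the demo files and the word "alpha" are invented, mktemp and split are assumed to be available, the chunk size of 2 is arbitrary, and file names are assumed to contain no whitespace (xargs would split them). It has not been tried on AIX.

```shell
# Hypothetical demo data so the sketch runs on its own.
demo=$(mktemp -d)
mkdir "$demo/files" "$demo/parts"
printf 'alpha beta\n' > "$demo/files/one.txt"
printf 'gamma\n'      > "$demo/files/two.txt"
printf 'Alpha\n'      > "$demo/files/three.txt"
printf 'delta\n'      > "$demo/files/four.txt"
word=alpha

# Split the file list into chunks of 2 names each (chunk.aa, chunk.ab, ...).
find "$demo/files" -type f | sort > "$demo/list"
split -l 2 "$demo/list" "$demo/parts/chunk."

# Start one background grep per chunk; each writes to its own result file
# so that the jobs never interleave their output.
for chunk in "$demo"/parts/chunk.*
do
  xargs grep -Fwil -e "$word" < "$chunk" > "$chunk.out" &
done
wait    # block until every background job has finished

# Merge the per-chunk results.
result=$(cat "$demo"/parts/*.out | sort)
printf '%s\n' "$result"
rm -rf "$demo"
```

Whether this actually reduces the real time depends on how many CPUs are free and on whether the disks can keep up, so the number of chunks would need tuning on the real box.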

Best wishes ... cheers, drl

This code ran in about 20 minutes, versus about 2 hours for my script. YAY! Unfortunately, when I compared the output files, I did find several differences. I'll have to look at the individual files to see why they're different. Thanks!

The difference may be because grep uses the -F option in my script, which means the pattern is matched as a literal string. Without -F, arbitrary strings are treated as regular expressions and you may get unintended matches. For example, a single . (dot) means "any character". If input.txt contains regular expressions instead of plain strings, then you should leave out the -F option...
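
A small illustration of the difference, using a made-up sample line and pattern:

```shell
# Hypothetical sample line: it contains "axb" but no literal "a.b".
sample='axb'

# Without -F the dot is a regex metacharacter and matches the "x".
regex_hit=$(printf '%s\n' "$sample" | grep -c -e 'a.b' || true)

# With -F the pattern is a fixed string, so the dot must be a real dot.
literal_hit=$(printf '%s\n' "$sample" | grep -c -F -e 'a.b' || true)

printf 'regex: %s, literal: %s\n' "$regex_hit" "$literal_hit"
# prints "regex: 1, literal: 0"
```

The `|| true` is only there because grep exits with a non-zero status when nothing matches.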