Find all string occurrences

SIMMS7400 · December 2, 2016, 1:00pm

Hi Folks -

I need to count the occurrences of a string in a set of *.txt files.

My code is as follows:

find . -name "*txt" -exec cat {} + | grep -ic " 0"

I was wondering a few things:

1. Is that an acceptable method to achieve my goal? It works as expected, but I'm just curious.
2. Is there a way to output the number of occurrences found for each file name, and then total everything up into one value?

Thank you all!

Corona688 · December 2, 2016, 1:18pm

Why run cat when you can run grep?

This will list counts from all files:

find . -name '*txt' -exec grep -icH " 0" '{}' '+'

This will add a TOTAL line to the end:

find . -name '*txt' -exec grep -icH " 0" '{}' '+' | awk -F: '{ T += $2 } 1; END { print "TOTAL:" T }'

Same thing, but excludes files with zero matches:

find . -name '*txt' -exec grep -icH " 0" '{}' '+' | awk -F: '{ T += $2 } $2+0; END { print "TOTAL:" T }'

grep -H forces grep to always print the related filename.
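
A minimal sketch of the count-and-total pipeline above, run against made-up sample files in a scratch directory (the file names and contents are assumptions for illustration only). It also shows that grep -c reports matching lines rather than individual occurrences, and it relies on file names containing no colon, since awk splits on ":".

```shell
# Hypothetical sample data in a scratch directory (not from the thread).
tmp=$(mktemp -d)
printf 'a 0 b 0\nc 0\nnothing\n' > "$tmp/one.txt"
printf 'no match here\n' > "$tmp/two.txt"

# grep -c counts matching LINES: one.txt holds 3 occurrences on 2 lines.
counts=$(find "$tmp" -name '*.txt' -exec grep -icH " 0" '{}' '+' | sort)
printf '%s\n' "$counts"

# Sum the second colon-separated field to get the grand total.
total=$(printf '%s\n' "$counts" | awk -F: '{ T += $2 } END { print "TOTAL:" T }')
printf '%s\n' "$total"

rm -rf "$tmp"
```

If several hits can share a line and each one should be counted, a per-occurrence counter such as awk's gsub() is probably a better fit than grep -c.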

SIMMS7400 · December 2, 2016, 1:24pm

Corona -

Thank you so much! I will test now.

If I wanted to spool the results to a text file instead of printing them to the screen, would I simply do this:

find . -name '*txt' -exec grep -icH " 0" '{}' '+' | awk -F: '{ T += $2 } 1; END { echo "TOTAL:" T  >>results.txt }'

Thank you!

Don_Cragun · December 2, 2016, 1:35pm

No. There is no awk echo command. Try:

find . -name '*txt' -exec grep -icH " 0" '{}' '+' | awk -F: '{ T += $2 } 1; END { print "TOTAL:" T }' > results.txt

if you want to replace the contents of results.txt with the output from this run, or:

find . -name '*txt' -exec grep -icH " 0" '{}' '+' | awk -F: '{ T += $2 } 1; END { print "TOTAL:" T }' >> results.txt

if you want to append the output from this run to that file's previous contents.
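
The difference between the two redirections can be sketched with a throwaway file (the scratch file created by mktemp below is a stand-in for results.txt, not something from the thread):

```shell
out=$(mktemp)              # hypothetical scratch file standing in for results.txt

echo "run 1" >  "$out"     # > truncates first, so the file holds only this line
echo "run 2" >> "$out"     # >> appends after the existing content
appended=$(awk 'END { print NR }' "$out")   # number of lines after the append

echo "run 3" >  "$out"     # > again discards both earlier lines
replaced=$(cat "$out")

echo "lines after >>: $appended; content after >: $replaced"
rm -f "$out"
```

One thing that may be worth checking: the shell creates results.txt before find starts, and the name matches '*txt', so if it lives in the directory being searched it can end up among the files grep reads. Writing it somewhere else, or adding ! -name results.txt to the find expression, should avoid that.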

SIMMS7400 · December 2, 2016, 4:40pm

Folks -

Thank you very much! It's working as expected!!

SIMMS7400 · December 3, 2016, 9:00pm

Hi Guys -

I have one more request. I need to exclude or ignore a search string from my search.

The string is as follows:

I need to add it to the following code:

find . -name '*txt' -exec grep -icH " 0 " '{}' '+' | awk -F: '{ T += $2 } 1; END { print "TOTAL:" T }' >>TCP_FinSAP_20161203.txt

Thank you!

RudiC · December 4, 2016, 6:04am

Needs a slightly different approach - use awk in the first place instead of grep | awk :

find . -name "*.txt" -exec awk '{n=gsub (/ 0/, "&"); m=gsub (/"AC_STAT" 0/, "&"); CNT[FILENAME]+=n-m; TOT+=n-m} END {for (c in CNT) print c, CNT[c]; print "Total", TOT}' {} +

MadeInGermany · December 4, 2016, 6:46am

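With -exec ... +, find may run awk several times when the file list is too long for a single command line; each run then prints its own "Total", so the grand total does not work as intended.
So the pipe must be there...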
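
A rough way to see the effect without thousands of files is to force one awk run per file with \; (this only imitates the batching that + does on very long file lists; the sample files are made up for illustration):

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'x 0\n'     > "$tmp/a.txt"
printf 'y 0 z 0\n' > "$tmp/b.txt"

# One awk process per file: every process runs its own END block,
# so there are as many "Total" lines as there are awk runs.
split_totals=$(find "$tmp" -name '*.txt' \
    -exec awk '{ TOT += gsub(/ 0/, "&") } END { print "Total", TOT }' '{}' \; | sort)
printf '%s\n' "$split_totals"

# A single awk at the end of the pipe sees the output of every run,
# so it can produce one grand total however find batches the files.
grand=$(find "$tmp" -name '*.txt' \
    -exec awk '{ n += gsub(/ 0/, "&") } END { print n }' '{}' \; |
    awk '{ TOT += $1 } END { print "Total", TOT }')
printf '%s\n' "$grand"

rm -rf "$tmp"
```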

RudiC · December 4, 2016, 6:56am

Thanks for pointing that out. Try

find . -name "*.txt" -exec awk '{CNT[FILENAME]+= gsub (/ 0/, "&") - gsub (/"AC_STAT 0/, "&")} END {for (c in CNT) print c, CNT[c]}' {} + | awk '{TOT += $2} 1; END { print "Total", TOT}'

SIMMS7400 · December 4, 2016, 7:33am

Thank you, Rudi!

Which portion is my original search string though?

This portion?

(/ 0/

RudiC · December 4, 2016, 7:39am

Yes. You can also use " 0" .

SIMMS7400 · December 4, 2016, 8:23am

Thank you!

So:

find . -name "*.txt" -exec awk '{CNT[FILENAME]+= gsub (/" 0"/, "&") - gsub (/"AC_STAT 0/, "&")} END {for (c in CNT) print c, CNT[c]}' {} + | awk '{TOT += $2} 1; END { print "Total", TOT}'

Or is it:

find . -name "*.txt" -exec awk '{CNT[FILENAME]+= gsub (" 0", "&") - gsub (/"AC_STAT 0/, "&")} END {for (c in CNT) print c, CNT[c]}' {} + | awk '{TOT += $2} 1; END { print "Total", TOT}'

RudiC · December 4, 2016, 10:00am

The second option. For the future: how about doing some tests and finding out by yourself?
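
A quick test along those lines, using a made-up input line: inside /.../ the double quotes are part of the pattern, while in a plain string they only delimit it.

```shell
# Hypothetical test line containing both a bare " 0" and a quoted one.
line='x 0 y " 0" z'

# /" 0"/ is a regex literal, so the double quotes must appear in the input.
with_quotes=$(printf '%s\n' "$line" | awk '{ print gsub(/" 0"/, "&") }')

# " 0" is a string used as a dynamic regex; it matches space followed by 0.
as_string=$(printf '%s\n' "$line" | awk '{ print gsub(" 0", "&") }')

echo "regex literal with quotes: $with_quotes, plain string: $as_string"
```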

Aia · December 4, 2016, 11:48am

If your grep supports the -P flag for Perl-compatible regular expressions, then you can just make this little adjustment:

find . -name '*txt' -exec grep -PicH '(?<!"AC_STAT") 0 ' '{}' '+' | awk -F: '{ T += $2 } 1; END { print "TOTAL:" T }'
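
A small sketch of how the negative lookbehind behaves, assuming a grep built with PCRE support (GNU grep usually is) and using made-up sample lines. The -o variant is an addition for illustration, not part of the command above; it prints every match separately, so occurrences can be counted instead of lines.

```shell
# Hypothetical sample file; each line ends with a space so " 0 " can match.
tmp=$(mktemp)
printf '"AC_STAT" 0 \nA 0 B 0 \n"AC_STAT" 0 and X 0 \n' > "$tmp"

# -c counts LINES with at least one " 0 " not directly preceded by "AC_STAT".
lines=$(grep -Pc '(?<!"AC_STAT") 0 ' "$tmp")

# -o emits each match on its own line; counting those lines gives occurrences.
occurrences=$(grep -Po '(?<!"AC_STAT") 0 ' "$tmp" | grep -c '')

echo "matching lines: $lines, occurrences: $occurrences"
rm -f "$tmp"
```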