Count specific character(s) in a very large file

dcfargo · July 1, 2008, 8:58am

I'm trying to count the occurrences of two specific characters in a very large file. I'd like to avoid using gsub because it's taking too long.

I was thinking something like:

awk '-F[X,Y]' { t += NF - 1 } END {print t}' infile > outfile

which isn't working

Any ideas would be great.

vgersh99 · July 1, 2008, 9:04am

awk -F'(X|Y)' '{ t += (NF - 1) } END {print t}' infile > outfile

radoulov · July 1, 2008, 9:06am

This should work:

awk -F'[XY]' 'NF>1{t+=NF-1}END{print t}' input

On Solaris, use nawk or /usr/xpg4/bin/awk instead of the default awk.
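
To see why this works: splitting a line on X or Y yields one more field than there are separators, so NF-1 is the number of matches on that line. Here is a small sketch you can run on sample input before pointing it at the big file. The function name count_xy and the sample text are made up for illustration, and the t+0 is an addition so that 0 is printed instead of an empty line when nothing matches. Note also that inside brackets a comma is literal, so the original '[X,Y]' would have counted commas as well.

```shell
# count_xy is a hypothetical helper name, not part of the thread.
# It reads stdin and counts every X or Y using awk field splitting.
count_xy() {
    # NF-1 separators per line; t+0 forces numeric output when t is unset.
    awk -F'[XY]' 'NF > 1 { t += NF - 1 } END { print t + 0 }'
}

# Sample input: "aXbYc" has 2 matches, "none" has 0, "XX" has 2.
printf 'aXbYc\nnone\nXX\n' | count_xy   # prints 4
```

For the real file you would run something like: count_xy < infile > outfile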

Or try Perl:

perl -nle'$t+=tr/XY//;print $t if eof' input
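
Another option, not from the posts above and untested on your data, is to skip awk and Perl altogether: let tr delete everything except the characters of interest and count what is left with wc. This is a different technique (filter and count bytes rather than field splitting), and it assumes X and Y are single-byte characters. The function name count_xy_tr is made up for illustration.

```shell
# count_xy_tr is a hypothetical helper name, not part of the thread.
# tr -cd 'XY' deletes every byte that is NOT X or Y (newlines included),
# so wc -c counts only the surviving X and Y bytes.
count_xy_tr() {
    # The trailing tr strips the padding some wc implementations add.
    tr -cd 'XY' | wc -c | tr -d ' '
}

printf 'aXbYc\nnone\nXX\n' | count_xy_tr   # prints 4
```

For the real file: count_xy_tr < infile > outfile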

dcfargo · July 1, 2008, 9:25am

Thank you so much. It's working like a charm and finishing in a few minutes instead of the half a day the gsub version was taking.

:)