Find a string and then return the next 20 characters in multiple files

Hello all,

I have a directory with 2000+ files. I need to look in each file for an invoice number. To identify it, I can search for the string 'BIG' and then retrieve the next 30 characters. I was thinking of using awk for this, but I'm not sure how to do it. Each file contains one long string, and the invoice number is in the middle. If I can find the position of the 'BIG' pattern and then grab the next 30 characters, I can extract the invoice number I need.

I basically need to pull out all 2000+ invoice numbers and put them in one file, one invoice number per line.

Any help is much appreciated.

SAMPLE input:

TEST FILE|USING|NEW|SYSTEM|BIG|20130924|49685234|THIS ISNT THE END|BYE

output needed:

BIG|20130924|49685234

Keep in mind I need to do this for 2000+ files in one directory.

THANKS!

Jennifer

I only count 18 characters after BIG. If that's the case, you can use this:

grep -oE 'BIG.{18}' file
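For a grep without -o, the same fixed-width idea can be sketched in awk, which is closer to what the question describes: find the position of BIG, then take a fixed number of characters from there. This is only a sketch; the temporary file and the width of 21 (BIG plus the 18 characters counted above) are assumptions based on the sample line.

```shell
# Sample line from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'TEST FILE|USING|NEW|SYSTEM|BIG|20130924|49685234|THIS ISNT THE END|BYE' > "$sample"

# index() returns the 1-based position of BIG (0 if it is absent);
# substr() then takes BIG plus the next 18 characters (3 + 18 = 21).
result=$(awk '{ p = index($0, "BIG"); if (p > 0) print substr($0, p, 21) }' "$sample")

echo "$result"
```

On the sample line this prints BIG|20130924|49685234. If the invoice number can vary in length, a fixed width will cut it short or pick up extra text, so a delimiter-based pattern is safer.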

Hi,
What you asked for:

grep -o 'BIG.\{1,30\}' file

But this may be better:

grep -o 'BIG|\([^|]\+|\)\{1,2\}' file

where file can be a list of files; use * for all files in the directory.
Add the -h option if you don't want the file name in the result.

Regards.
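Putting those pieces together for a whole directory might look like the sketch below. It assumes a grep that supports -o and -h (GNU and BSD grep do, but they are not in POSIX), and the directory and file names are made up for the demonstration. The pattern is a simplified variant that stops before the third pipe.

```shell
# Throwaway directory with two sample files (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' 'TEST FILE|USING|NEW|SYSTEM|BIG|20130924|49685234|THIS ISNT THE END|BYE' > "$workdir/inv1.txt"
printf '%s\n' 'OTHER|HEADER|BIG|20131001|50001234|MORE TEXT|BYE' > "$workdir/inv2.txt"

# -o prints only the matched text, -h suppresses the file name prefix.
# The pattern is BIG, a pipe, then two pipe-delimited fields.
grep -oh 'BIG|[^|]*|[^|]*' "$workdir"/inv*.txt > "$workdir/invoices.out"

cat "$workdir/invoices.out"
```

The result is one match per line in invoices.out, in the order the shell expands the file names.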

Thank you both, but when I try those commands, it says it doesn't recognize the -o flag.

Ok,
with sed:

sed -n 's/.*\(BIG.\{1,30\}\).*/\1/p'
sed -n 's/.*\(BIG|\([^|]\+|\)\{1,2\}\).*/\1/p'

regards.

Thanks for the sed, but I am not sure how to use it with a list of files.

As explained for grep:

sed .... *

Regards.
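Spelled out, a run over every file that collects the results in one output file could look like this sketch. The directory, the file names, and the output location are assumptions for the demonstration, and the pattern is a simplified variant that avoids the \+ extension. The output file is kept outside the input directory so that a rerun with * does not read its own output.

```shell
# Throwaway input directory with two sample files (hypothetical names).
indir=$(mktemp -d)
outfile=$(mktemp)
printf '%s\n' 'TEST FILE|USING|NEW|SYSTEM|BIG|20130924|49685234|THIS ISNT THE END|BYE' > "$indir/a.txt"
printf '%s\n' 'OTHER|HEADER|BIG|20131001|50001234|MORE TEXT|BYE' > "$indir/b.txt"

# sed accepts many files; the shell expands * to every name in the directory.
# -n plus the p flag prints only lines where the substitution succeeded.
sed -n 's/.*\(BIG|[^|]*|[^|]*\).*/\1/p' "$indir"/* > "$outfile"

cat "$outfile"
```

Note that the leading .* is greedy, so if a line contained BIG more than once, the last occurrence would be the one printed.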

wonderful, it works perfectly!!!

You may want to know the filename attached to each invoice number, and the invoice number may be shorter or even longer than 8 characters. Try this:

awk '{for (i=1;  i<=NF; i++) if ($i=="BIG") print FILENAME, ": ", $i, $(i+1), $(i+2)}' FS="|"  *
file :  BIG 20130924 49685234

In case there's only one InvNo per file, and your awk has the "nextfile" command, you may want to add the nextfile to the end of the script line.
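A sketch of that variant, assuming an awk that supports nextfile (gawk and recent versions of mawk and BSD awk do) and using made-up sample files:

```shell
# Throwaway directory with two sample files (hypothetical names).
indir=$(mktemp -d)
printf '%s\n' 'TEST FILE|USING|NEW|SYSTEM|BIG|20130924|49685234|THIS ISNT THE END|BYE' > "$indir/a.txt"
printf '%s\n' 'OTHER|HEADER|BIG|20131001|50001234|MORE TEXT|BYE' > "$indir/b.txt"

# For each file, print the file name and the two fields after BIG,
# then skip straight to the next file instead of scanning the rest.
result=$(cd "$indir" && awk -F'|' '{
    for (i = 1; i <= NF; i++)
        if ($i == "BIG") { print FILENAME ": " $(i+1), $(i+2); nextfile }
}' *)

echo "$result"
```

With one invoice per file, the output is the same with or without nextfile; it only saves awk from reading the remainder of each file after the first match.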