Searching for multiple patterns in a file

Kesavan · January 10, 2011, 7:09am

Hi All,

I have a file in which I have to search for a pattern from the beginning of the file, and if the pattern is found, I then have to perform a reverse search from that line back toward the beginning of the file to get the first occurrence of another pattern.

Sample input file:

hey
what are you doing
IM324
A1
iam doing mca
how about you?
IM326
B2

As shown above, the file contains a lot of patterns like this.
For example, I have the pattern "B2" as the first pattern, and if that is found, I have to get the first occurrence of "IM..." which is associated with it. This IM pattern may be immediately above B2 or a few lines above it.

If you have any ideas, please share them with me :)

anurag.singh · January 10, 2011, 7:39am

awk '$0 ~ /^IM/ {f=$0;next} $0 ~ /^B2/ {if(f){f=f"\n"$0;print f;f="";next}} {if(f) f=f"\n"$0;}' inputFile

Scrutinizer · January 10, 2011, 7:46am

sed -n '/IM/h;/B2/{g;p;}' infile

anurag.singh · January 10, 2011, 7:52am

@Scrutinizer, I guess

sed -n '/IM/h;/B2/{H;g;p;}' infile

to also print the line containing the last pattern.

Scrutinizer · January 10, 2011, 7:58am

That is correct, although I did not gather that this was required.

anurag.singh · January 10, 2011, 8:36am

Or maybe like this:

sed -n 'H;/IM/h;/B2/{g;p;}' infile

since the previous one will not print the lines between the patterns (as shown below). The only catch is that there has to be a starting pattern; otherwise (if the starting pattern is not found) everything up to the ending pattern will be printed.

hey
what are you doing
IM324
A1
iam doing mca
how about you?
IM326
dfgdgfdg
dfgdfgfdg
B2
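
To compare the three sed one-liners posted so far, they can be run against this sample. This is only a sketch: it assumes a POSIX shell with mktemp available, and the scratch file and variable names are made up for illustration.

```shell
# Recreate the sample above in a scratch file (the path is arbitrary).
infile=$(mktemp)
printf '%s\n' 'hey' 'what are you doing' 'IM324' 'A1' 'iam doing mca' \
    'how about you?' 'IM326' 'dfgdgfdg' 'dfgdfgfdg' 'B2' > "$infile"

# Post #3: only the most recent IM line.
only_im=$(sed -n '/IM/h;/B2/{g;p;}' "$infile")
# Post #4: the most recent IM line plus the B2 line.
im_and_b2=$(sed -n '/IM/h;/B2/{H;g;p;}' "$infile")
# Post #6: the IM line, the lines in between, and the B2 line.
whole_block=$(sed -n 'H;/IM/h;/B2/{g;p;}' "$infile")

printf '%s\n---\n%s\n---\n%s\n' "$only_im" "$im_and_b2" "$whole_block"
rm -f "$infile"
```

On this sample the first command prints only IM326, the second prints IM326 and B2, and the third prints the four lines from IM326 through B2.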

Kesavan · January 10, 2011, 10:46pm

Hi Anurag,

Could you please explain how it works?

anurag.singh · January 11, 2011, 1:11am

I guess you should use the solution in post #6 or in post #2.
If you are new to awk/sed, you may not understand it. You need to go through a basic tutorial first.

Post #6:

sed -n 'H;/IM/h;/B2/{g;p;}' infile

A few meanings (look into the sed manual or a sed tutorial for more details):
-n => Switch to suppress default output; nothing is printed unless explicitly requested
H => Append a newline and then the contents of the pattern space to the hold space
h => Replace the contents of the hold space with the contents of the pattern space
g => Replace the contents of the pattern space with the contents of the hold space
p => Copy the pattern space to the standard output

The above command does the following for every input line:

Append the line to the hold space
If the IM pattern is found, write the current line to the hold space (overwriting and removing the old data)
If the B2 pattern is found, copy the hold space content to the pattern space and print it to standard output.

So if the input file doesn't have the IM (starting) pattern but only the B2 (ending) pattern, everything up to the ending pattern will be printed (which I believe is not expected). This works well if we are sure that whenever there is an ending pattern in the file, there is at least one starting pattern before it.
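
One possible way around that caveat is to print only when the hold space actually begins with an IM line. The guarded command below is a sketch, not something posted in this thread: it assumes the IM pattern always sits at the start of its line (as in the sample), and the scratch files and variable names are made up for illustration.

```shell
# Scratch files; names and contents are made up for this illustration.
no_start=$(mktemp)
with_start=$(mktemp)
printf '%s\n' 'hey' 'what are you doing' 'B2' > "$no_start"
printf '%s\n' 'hey' 'IM324' 'A1' 'B2' > "$with_start"

# Post #6 as written: with no IM line, all accumulated lines are printed.
unguarded=$(sed -n 'H;/IM/h;/B2/{g;p;}' "$no_start")

# Guarded variant (an assumption, not from the thread): anchor IM at the
# start of the line and print only if the hold space begins with IM.
guarded_none=$(sed -n 'H;/^IM/h;/B2/{g;/^IM/p;}' "$no_start")
guarded_block=$(sed -n 'H;/^IM/h;/B2/{g;/^IM/p;}' "$with_start")

printf 'unguarded:\n%s\nguarded, no IM:\n%s\nguarded, with IM:\n%s\n' \
    "$unguarded" "$guarded_none" "$guarded_block"
rm -f "$no_start" "$with_start"
```

The guard relies on H: appending to an empty hold space leaves a leading newline, so the hold space can only begin with IM after h has overwritten it with an IM line. It covers only the missing-start case described above.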

===============================================
Post #2:

awk '$0 ~ /^IM/ {f=$0;next} $0 ~ /^B2/ {if(f){f=f"\n"$0;print f;f="";next}} {if(f) f=f"\n"$0;}' inputFile

==>> If a line starts with IM, store the line in f (replacing whatever was stored before)
==>> If a line starts with neither IM nor B2, append it to f, but only if f already holds some data (meaning an IM line has been found in the previous lines)
==>> If a line starts with B2 and f is non-empty, append the line to f and print f. Then set f to empty.
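
The same logic can be spread over several lines with one comment per rule, which may be easier to follow. This is a sketch that should behave like the one-liner from post #2; the scratch file (built from the sample in the first post) and the result variable are made up for illustration, and mktemp is assumed to be available.

```shell
# Scratch copy of the sample input from the first post.
inputFile=$(mktemp)
printf '%s\n' 'hey' 'what are you doing' 'IM324' 'A1' 'iam doing mca' \
    'how about you?' 'IM326' 'B2' > "$inputFile"

result=$(awk '
    # An IM line starts a fresh block, discarding any earlier one.
    /^IM/ { f = $0; next }
    # A B2 line closes an open block: print it and reset f.
    /^B2/ { if (f != "") { print f "\n" $0; f = "" } next }
    # Any other line is appended only while a block is open.
    f != "" { f = f "\n" $0 }
' "$inputFile")

printf '%s\n' "$result"
rm -f "$inputFile"
```

On that sample the block started at IM324 is thrown away when IM326 is seen, so only IM326 and B2 are printed.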

JavaHater · January 11, 2011, 2:37am

awk '/IM/{h=$0};/B2/{print h"\n"$0;exit}' file