Getting lines between two strings with duplicate set of data

nariwithu · October 1, 2013, 11:03am

If I have the following lines in a file app.log:

some lines here

<AAAA>
abc
<id>123456789</id>
ddd
</AAAA>

some lines here too

<BBBB>
abc
<id>123456789</id>
ddd
</BBBB>

some lines here too

<AAAA>
xyz
<id>987654321</id>
ssss
</AAAA>

some lines here again...

How do I get the particular response that I am interested in by providing the id?
For example, if I am interested in the AAAA response for id 123456789, I want the output below:

<AAAA>
abc
<id>123456789</id>
ddd
</AAAA>

I might need some changes to the command that I am using. It gives me all AAAA responses, but I need only the response that contains the id 123456789.

cat app.log | sed -n '/<AAAA/,/<\/AAAA>/p'

Thanks in advance.

Narayana.V

RudiC · October 1, 2013, 11:16am

This may work, at least on your simplified samples:

awk '/<AAAA>/,/<\/AAAA>/ {X=X"\n"$0} /<\/AAAA>/ && X ~ /123456789/ {print X; X=""}' file

<AAAA>
abc
<id>123456789</id>
ddd
</AAAA>

nariwithu · October 1, 2013, 11:41am

RudiC, it is giving me all AAAA responses.

RudiC · October 1, 2013, 1:24pm

Then your sample file may not be representative.

MadeInGermany · October 1, 2013, 1:53pm

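X should be cleared at the end of every block, not only when the id matches. Otherwise a non-matching block stays in X and is printed together with the next matching one: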

awk '/<AAAA>/,/<\/AAAA>/{X=X"\n"$0}
/<\/AAAA>/ {if (X~/123456789/) print X; X=""}' file
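
If the tag and id change from run to run, the same idea can be wrapped in a small shell function. This is only a sketch: the function name extract_block and its argument order are made up for illustration, and it assumes the id is written exactly as <id>...</id> inside the block and that blocks of the same tag are not nested. It uses plain substring tests (index) instead of regular expressions, so the tag and id are matched literally, and it also avoids the empty line that the X=X"\n"$0 form prints before each block.

```shell
# extract_block TAG ID FILE
# Print every <TAG>...</TAG> block in FILE that contains <id>ID</id>.
# Hypothetical helper, not from the thread; assumes blocks are not nested.
extract_block() {
    awk -v tag="$1" -v id="$2" '
        # Opening tag seen: start collecting into an empty buffer.
        index($0, "<" tag ">")  { inblk = 1; buf = "" }

        # While inside a block, append the current line to the buffer.
        inblk                   { buf = buf $0 "\n" }

        # Closing tag seen: print the buffer only if it holds the id,
        # then clear the state whether or not it matched.
        index($0, "</" tag ">") {
            if (inblk && index(buf, "<id>" id "</id>"))
                printf "%s", buf
            inblk = 0
            buf = ""
        }
    ' "$3"
}
```

Usage would be: extract_block AAAA 123456789 app.log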

nariwithu · October 1, 2013, 2:03pm

Thanks MadeInGermany, it is working perfectly.