Grep Issue

<record>
<set>
<termId>1234</termId>
<termType>First</termType>
</set>
<set>
<termId>5678</termId>
<termType>Second</termType>
</set>
</record>

This is saved in record.xml

Hi

I have this sample XML that I am grepping from a shell script.

The objective is to extract the <termId> that corresponds to a given <termType>.

For example, when <termType> is Second, I want the <termId> value 5678.

However, I am getting 1234 along with 5678.

How can I get only 5678 as the output?

This is how I am doing it in the shell script:

egrep "<termId>|<termType>Second" record.xml

I want this to just show 5678 and not 1234

Please advise.

Thanks a lot

Hi
I don't think you can get this using egrep. Try the command below:

sed -ne '/<termType>Second/{x;1!p;}' -e h record.xml

First of all - Thanks a ton here..

I can see it working with "Second" hard-coded. However, if I pass it as a variable, I do not get any results back.

Tried these:

pattern="Second";
sed -ne '/<termType>$pattern/{x;1!p;}' -e h record.xml

sed -ne '/<termType>"$pattern"/{x;1!p;}' -e h record.xml

sed -ne '/<termType>{$pattern}/{x;1!p;}' -e h record.xml

You can use the grep (or egrep) -B option to include one line above the match and search only for "<termType>Second". The downside is that you also get the "<termType>Second</termType>" line, and you didn't specify whether that's OK or not.

# echo '<record>
<set>
<termId>1234</termId>
<termType>First</termType>
</set>
<set>
<termId>5678</termId>
<termType>Second</termType>
</set>
</record>' | egrep -B 1 "<termType>Second"
<termId>5678</termId>
<termType>Second</termType>
# 
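If the extra <termType> line is not wanted, one possible follow-up is to filter the -B output a second time. This is only a sketch: it assumes a grep that supports -B (GNU and BSD grep do, but the option is not in POSIX), and it writes the sample data to a throwaway file created with mktemp instead of the real record.xml.

```shell
# Recreate the sample data in a temporary file (illustrative path only).
xml=$(mktemp)
cat > "$xml" <<'EOF'
<record>
<set>
<termId>1234</termId>
<termType>First</termType>
</set>
<set>
<termId>5678</termId>
<termType>Second</termType>
</set>
</record>
EOF

# -B 1 prints each matching line plus the one line before it.
# The second grep keeps only the <termId> line; sed then strips the tags.
id=$(grep -B 1 '<termType>Second' "$xml" | grep '<termId>' | sed 's/<[^>]*>//g')
echo "$id"

rm -f "$xml"
```

Like the original one-liner, this relies on <termId> sitting on the line directly above <termType>.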

Anytime you start embedding values into a regular expression, you need to keep in mind that the regular expression will break if special characters are present and unescaped. In cases like these, if possible, it's best to use simple string comparisons when dealing with such values (k==$2 in the code that follows).

My personal preference would be:

$ key=Second
$ awk -F'</?[^>]*>' '/^<termId>/ {id=$2} /^<termType>/ && k==$2 {print id; exit}' k="$key" record.xml 
5678

Regards,
Alister
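To illustrate the point about special characters, here is a minimal sketch. The termType value a.c is hypothetical (it is not in the original record.xml), and the data lives in a throwaway mktemp file. Embedded in a regular expression, the dot matches any character, so the wrong set is picked; the string comparison from the awk command above picks the right one.

```shell
# Sample data with two similar termType values (hypothetical, for illustration).
xml=$(mktemp)
cat > "$xml" <<'EOF'
<record>
<set>
<termId>1111</termId>
<termType>abc</termType>
</set>
<set>
<termId>2222</termId>
<termType>a.c</termType>
</set>
</record>
EOF

key='a.c'

# Regex match: "." matches any character, so the "abc" line matches first.
regex_id=$(sed -n -e "/<termType>$key</{x;p;q;}" -e h "$xml" | sed 's/<[^>]*>//g')

# String comparison: only an exact "a.c" value matches.
exact_id=$(awk -F'</?[^>]*>' '/^<termId>/ {id=$2} /^<termType>/ && k==$2 {print id; exit}' k="$key" "$xml")

echo "regex match:  $regex_id"
echo "string match: $exact_id"

rm -f "$xml"
```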

Hi all Unix gurus

Thanks a ton. Both of the above solutions work. Thanks to alister and fubaya, you are awesome. I appreciate the feedback from the other contributors on this thread.

The problem seems to be resolved

For your question,

Tried these -

pattern="Second";
sed -ne '/<termType>$pattern/{x;1!p;}' -e h record.xml

sed -ne '/<termType>"$pattern"/{x;1!p;}' -e h record.xml

sed -ne '/<termType>{$pattern}/{x;1!p;}' -e h record.xml

You could have overcome this by closing the single quotes around the variable so that the shell expands it:

sed -ne '/<termType>'"$pattern"'/{x;1!p;}' -e h record.xml

Thanks
Guru.
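For anyone wondering why the three earlier attempts returned nothing, here is a small sketch of the quoting behaviour. It assumes a POSIX-style shell, and the two-line input is a cut-down stand-in for record.xml. Inside single quotes the shell never expands $pattern, so sed searches for the literal text.

```shell
pattern="Second"

# Inside single quotes the shell leaves $pattern untouched.
single='/<termType>$pattern/'

# Closing the single quotes around a double-quoted "$pattern" lets it expand.
spliced='/<termType>'"$pattern"'/'

printf '%s\n' "$single" "$spliced"

# The same splice in the sed command from this thread, fed a two-line sample on stdin.
result=$(printf '%s\n' '<termId>5678</termId>' '<termType>Second</termType>' |
  sed -ne '/<termType>'"$pattern"'/{x;1!p;}' -e h)
echo "$result"
```

Putting the whole sed script in double quotes also expands the variable, but in an interactive bash session with history expansion enabled the ! in 1!p may be taken as a history reference, so the splice is probably the safer habit.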

sed -n "h;n;/termType>$pattern/{g;s/<termId>\(.*\)<\/termId>/\1/p}" record.xml

cheers,
Devaraj Takhellambam