Read XML tags and then remove a tag using a shell script

<Start>
<Header>
This is header section
</Header>
<Body>
<Body_start>
This is body section
<a>
<b>
<c>
<st>111</st>
</c>
<d>
<st>blank</st>
</d>
</b>
</a>
</Body_start>
<Body_section>
This is body section
<a>
<b>
<c>
<st>5</st>
</c>
<d>
<st>666</st>
</d>
</b>
<b>
<c>
<st>154</st>
</c>
<d>
<st>1457954</st>
</d>
</b>
<b>
<c>
<st>845034</st>
</c>
<d>
<st>blank</st>
</d>
</b>
</a>
</Body_section>
</Body>
</Start>

If the 'st' value inside a 'c' tag is 154, then the whole <b> to </b> block needs to be removed. The value 154 may or may not be present in the file.
If the value 154 is present, then the removal of the following part is needed:

<b>
<c>
<st>154</st>
</c>
<d>
<st>1457954</st>
</d>
</b>

I want to do this in a shell script. I cannot use XSLT because my system does not support it.

Is this a homework assignment?

What have you tried to solve this problem on your own?

Why does this have to be done entirely in shell? Why can't you install XSLT or use something like awk?

What operating system and shell are you using?

When you say "If the value 154 is present, then the removal of the following part is needed", to what does "part" refer?

Thanks for your reply. The XML removal is part of one big shell script; the existing script is in bash, and I am stuck on the XML tag removal. If the 'st' value is 154, then the whole <b> tag should be removed, and the modified XML will look like this:

<Start>
<Header>
This is header section
</Header>
<Body>
<Body_start>
This is body section
<a>
<b>
<c>
<st>111</st>
</c>
<d>
<st>blank</st>
</d>
</b>
</a>
</Body_start>
<Body_section>
This is body section
<a>
<b>
<c>
<st>5</st>
</c>
<d>
<st>666</st>
</d>
</b>
<b>
<c>
<st>845034</st>
</c>
<d>
<st>blank</st>
</d>
</b>
</a>
</Body_section>
</Body>
</Start>
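
For illustration, here is a minimal pure-bash sketch of the removal described above. It is not a real XML parser. It assumes the file looks like the sample: one tag per line, <b> blocks never nested inside each other, and the value written exactly as <st>154</st>. The function name remove_b_blocks and its argument are hypothetical, chosen only for this example.

```shell
#!/bin/bash
# Sketch: drop every <b>...</b> block whose <c> child holds <st>TARGET</st>.
# Assumes one tag per line (as in the sample) and no nested <b> blocks.
# Reads XML on stdin, writes the filtered XML to stdout.

remove_b_blocks() {
    local target=$1              # value to look for, e.g. 154
    local line trimmed
    local buffer=""              # lines of the <b> block being collected
    local in_b=0 in_c=0 drop=0

    while IFS= read -r line || [[ -n $line ]]; do
        # Strip leading/trailing whitespace, for comparisons only.
        trimmed=${line#"${line%%[![:space:]]*}"}
        trimmed=${trimmed%"${trimmed##*[![:space:]]}"}

        if (( in_b == 0 )); then
            if [[ $trimmed == "<b>" ]]; then
                # Start buffering a new <b> block.
                in_b=1; in_c=0; drop=0
                buffer=$line$'\n'
            else
                printf '%s\n' "$line"
            fi
            continue
        fi

        # Inside a <b> block: keep buffering until </b> decides its fate.
        buffer+=$line$'\n'
        case $trimmed in
            "<c>")  in_c=1 ;;
            "</c>") in_c=0 ;;
            "<st>$target</st>")
                # Only a match inside <c> marks the block for removal.
                (( in_c )) && drop=1
                ;;
            "</b>")
                (( drop )) || printf '%s' "$buffer"
                in_b=0; buffer=""
                ;;
        esac
    done

    # An unterminated <b> block (malformed input) is passed through unchanged.
    printf '%s' "$buffer"
}
```

It could be called as remove_b_blocks 154 < input.xml > output.xml, where both file names are placeholders. Buffering each <b> block until its closing tag is what lets the function decide whether to print or discard the whole block. A 154 that appears only under <d> leaves the block in place, and a file without any 154 passes through unchanged.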

I'm afraid that unless you answer Don Cragun's questions in their entirety you won't get any further replies...
