big xml file with nested loop parse

unclecameron · December 21, 2010, 1:49pm

I have an xml file with the structure:

<tag1>
      <value1>xyx</value1>
      <value2>123</value2>
</tag1>
<tag1>
      <value1>568</value1>
      <value2>zzzzz</value2>
</tag1>

where I want to parse each data pair in this single file, so something like:

find first tag1 data pair
      xmlstarlet can get the data (I have this working)
      put that data in a database (I have this working)
go to next tag1 data pair

I don't know if I can make an awk loop with a FS of <>, or just use something like:

awk '/tag1/, /\/tag1/' sample.xml

which works for an individual data pair, but I just don't know how to build the right loop.

anurag.singh · December 21, 2010, 2:37pm

 
awk -F'(>|<)' '/value1/,/value2/ {if($2=="value1") v1=$3; else a[v1 " " $3]++}END{for (i in a) print i}' inputXML

For the above input, the output is (the order of a for-in loop is not guaranteed in awk):

xyx 123
568 zzzzz
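
If each pair has to be handled one at a time (for example, handed to a database insert), one option is to have awk print one line per closing tag and read those lines in a shell loop. This is only a minimal sketch: the function name parse_pairs, the temporary sample file, and the echo standing in for the database step are assumptions for illustration, and it assumes each tag sits on its own line and that the values are non-empty and contain no tabs.

```shell
#!/bin/sh
# Hypothetical helper: emit "value1<TAB>value2" once per </tag1>, in file order.
parse_pairs() {
    awk -F'[<>]' '
        $2 == "value1" { v1 = $3 }                        # remember first value
        $2 == "value2" { v2 = $3 }                        # remember second value
        $2 == "/tag1"  { print v1 "\t" v2; v1 = v2 = "" } # end of record
    ' "$@"
}

# Sample input copied from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tag1>
      <value1>xyx</value1>
      <value2>123</value2>
</tag1>
<tag1>
      <value1>568</value1>
      <value2>zzzzz</value2>
</tag1>
EOF

tab=$(printf '\t')
parse_pairs "$sample" | while IFS="$tab" read -r v1 v2; do
    # Placeholder for the real work (e.g. the database insert).
    echo "value1=$v1 value2=$v2"
done

rm -f "$sample"
```

Printing on the closing tag, rather than collecting keys in an array, keeps the pairs in file order and does not merge duplicate pairs.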