How to read values from XML tags based on the appropriate tag

Hi,

I've got a situation where I need to read the values from XML tags in a file. Please find the sample xml code below:

<entity>
  <name>Testing</name>
  <number>11</number>
  <template>testing.testing</template>
</entity>
<entity>
  <name>Development</name>
  <number>13</number>
</entity>
<entity>
  <name>Support</name>
  <number>18</number>
  <location>US</location>
  <template>analyze.analyze</template>
</entity>

Now, I want to retrieve the <name> and <template> tag values using a single command. However, the <template> tag does not exist for some entities.

output should be:

NAME - TEMPLATE
Testing - testing
Development - 
Support - analyze

In the output, the TEMPLATE column should contain only the string before the first . (period).

Please help me out!!!

Thanks in advance

This works on your sample data:

awk -F'[<>.]' '/<entity>/,/<\/entity>/{
if(/<name>/) n=$3
if(/<template>/) t=$3
if(/<\/entity>/) {print n,t;n=t=""}}' OFS=' - ' file
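To see why $3 holds the wanted value, it can help to print the fields awk produces when <, > and . are all treated as separators. This is only a minimal sketch; show_fields is an illustrative helper name, not part of the command above.

```shell
# Print every field awk sees when <, > and . are all field separators.
# show_fields is a hypothetical helper used only for this illustration.
show_fields() {
  awk -F'[<>.]' '{ for (i = 1; i <= NF; i++) printf "field %d = [%s]\n", i, $i }'
}

echo '  <template>testing.testing</template>' | show_fields
# field 1 = [  ]
# field 2 = [template]
# field 3 = [testing]
# field 4 = [testing]
# field 5 = [/template]
# field 6 = []
```

Field 3 is the text between the opening tag and the first period, which is why both <name> and <template> lines can be read with $3.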
awk -F "[<>.]" 'NR==1{print "Name Template"}
/<name>/{if(s){print s"-";s=$3}else{s=$3}}
/<template>/{s=s"-"$3;print s;s=""}
END{if(s)print s"-"}' file

Hi, thanks for the update.

A small change: I don't want to print the <name> of any <entity> in which <template> is not present. Could you help?


You could have mentioned that in the original post itself. Your output in that post did not require this constraint. In future, mention your requirements clearly and fully.

awk -F'[<>.]' '/<entity>/,/<\/entity>/{
if(/<name>/) n=$3
if(/<template>/) t=$3
if(/<\/entity>/) {if(t) print n,t;n=t=""}}' OFS=' - ' file
awk -F "[<>.]" 'NR==1{print "Name Template"}
/<name>/{s=$3}
/<template>/{s=s"-"$3;print s;s=""}' file

Hi,

It works fine, thank you. Sorry to bother you again.

I have another scenario for reading values. Please find the sample XML below:

<entity>
 <name>Testing</name>
 <number>13</number>
 <template>analyze.analyze</template>
 <requirement>
    <prerequire>
       <prerequireId>12</prerequireId>
       <prerequire>439</prerequire>
   </prerequire>
  </requirement>
</entity>
<entity>
 <name>Support</name>
 <number>19</number>
 <template>testing.testing</template>
 <requirement>
    <prerequire>
       <prerequireId>10</prerequireId>
       <prerequire>9382</prerequire>
   </prerequire>
    <prerequire>
       <prerequireId>10</prerequireId>
       <prerequire>9382</prerequire>
   </prerequire>
  </requirement>
</entity>
<entity>
 <name>Development</name>
 <number>14</number>
 <template>testing.testing</template>
 <requirement>
    <prerequire>
       <prerequireId>11</prerequireId>
       <prerequire>1928</prerequire>
   </prerequire>
    <prerequire>
       <prerequireId>12</prerequireId>
       <prerequire>1458</prerequire>
   </prerequire>
    <prerequire>
       <prerequireId>9</prerequireId>
       <prerequire>9894</prerequire>
   </prerequire>
  </requirement>
</entity>

The output should display as below:

In the sample XML above, the number of <prerequire> tags may vary, or the tag may not exist at all.
Please help!


awk -F "[<>.]" 'NR==1{print "Name Template Prequire1 Prequire2 Prequire3"}
/<name>/{s=$3}
/<template>/{s=s" "$3}
/<\/prerequire>/{s=s" "$3}
/<\/entity>/{print s;s=""}' file

The command returns the result in a single column, not in the format that I provided. Please check and confirm.

No, it works perfectly with the input you provided:

$ awk -F "[<>.]" 'NR==1{print "Name Template Prequire1 Prequire2 Prequire3"}
/<name>/{s=$3}
/<template>/{s=s" "$3}
/<\/prerequire>/{s=s" "$3}
/<\/entity>/{print s;s=""}' file
Name Template Prequire1 Prequire2 Prequire3
Testing analyze 439
Support testing 9382  9382
Development testing 1928  1458  9894
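The doubled spaces in the output above appear to come from the wrapper's closing </prerequire> line: it also matches /<\/prerequire>/, but its $3 is empty, so only a space gets appended. A small sketch to check this (prerequire_fields is an illustrative helper name):

```shell
# For every line matching </prerequire>, show the field count and $3.
# prerequire_fields is a hypothetical helper used only for this illustration.
prerequire_fields() {
  awk -F'[<>.]' '/<\/prerequire>/ { printf "NF=%d value=[%s]\n", NF, $3 }'
}

printf '%s\n' '       <prerequire>9382</prerequire>' '   </prerequire>' | prerequire_fields
# NF=5 value=[9382]
# NF=3 value=[]
```

Testing the field count, as the NF > 4 check later in the thread does, is one way to skip the wrapper line.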

It only works for the sample XML provided. My actual file is 2000 lines long, and a few <entity> tags don't have the <prerequire> tag. The command should also not return the <name> of any <entity> that has no <template>. Could you also put a ',' (comma) between each cell in the output, like below:

awk -F "[<>.]" 'NR==1{print "Name Template Prequire1 Prequire2 Prequire3"}
/<name>/{n=$3}
/<template>/{t=$3}
/<\/prerequire>/ {if(NF > 4){if(s){s=s","$3}else{s=$3}}}
/<\/entity>/{if(t){print n,t,s;s="";t=""}else{s=""}}' OFS="," file

Thanks, but the above is still failing. What is the variable 's' in the highlighted code above? Where does 's' get its value from?

The value of s is collected from the <prerequire> tag.

Oh, I just forgot to add one more condition.
This should work:

awk -F "[<>.]" 'NR==1{print "Name Template Prequire1 Prequire2 Prequire3"}
/<name>/{n=$3}
/<template>/{t=$3}
/<\/prerequire>/ {if(NF > 4){if(s){s=s","$3}else{s=$3}}}
/<\/entity>/{if(t){if(s){print n,t,s;s="";t=""}else{print n,t;t=""}}else{s=""}}' OFS="," file
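For anyone wondering where n, t and s get their values, the same idea can be spread over several lines with comments. This is only a sketch under the assumptions of the sample data (one tag per line, inner <prerequire> values on their own line). The function name entity_csv is made up for illustration, and the header line is left out.

```shell
# entity_csv is a hypothetical wrapper; it reads the files given as
# arguments, or standard input when there are none.
entity_csv() {
  awk -F'[<>.]' '
    /<name>/     { n = $3 }    # entity name
    /<template>/ { t = $3 }    # template text before the first period
    # Lines that carry a prerequire value split into more than 4 fields;
    # the closing wrapper line has only 3 and is skipped.
    /<\/prerequire>/ && NF > 4 { s = (s == "" ? $3 : s "," $3) }
    /<\/entity>/ {
      # Print only entities that have a template; append s when present.
      if (t != "") print (s == "" ? n "," t : n "," t "," s)
      n = t = s = ""           # reset for the next entity
    }
  ' "$@"
}

entity_csv <<'EOF'
<entity>
 <name>Testing</name>
 <template>analyze.analyze</template>
 <requirement>
    <prerequire>
       <prerequireId>12</prerequireId>
       <prerequire>439</prerequire>
   </prerequire>
  </requirement>
</entity>
<entity>
 <name>Development</name>
 <number>13</number>
</entity>
EOF
# Testing,analyze,439
```

Resetting n, t and s together at </entity> keeps values from one entity from leaking into the next, which matters once some entities lack a tag.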