Need help with index() in awk

I have data in XML format and am trying to extract the codes from the two <String> elements. The data is below.

<Node>tollfree<Condition>BooleanOperator<Operation>AND</Operation><Condition>BooleanOperator<Operation>NOT</Operation><Condition>FieldSelection<Field Context="ALL fields" condition="StringList" field="Other-Party-Id-Address"/><String>530101,530103,530105,509091,509092,509093,509095,530100,509099</String><Comparation>3</Comparation><MatchCase>true</MatchCase></Condition></Condition><Condition id="H_FREE">FieldSelection<Field Context="ALL fields" condition="StringList" field="Other-Party-Id-Address"/><String>53546,50878,511150,51985,58595,563636,51980,50605,50405,55501,55255,55502,56694,55436,55437,518180,54334,58888084,55511,55522,563000,55507,55530,55541,55577,58327,54300,5303020,543210,554562,55323,50325,54400,511310,56660,56789,57323,52347,544111,55500,51802,54200,542001,57878,52000,50234,55694,54422,51121,54054,554356,53111,59888,53010,51477,51144,55321,55256,51967,50909,57677,1950,55177,52409,52256,52027,161,52134,58888111,58888511,58888711</String><Comparation>0</Comparation><MatchCase>true</MatchCase></Condition></Condition><Tariff>Rate<UnitType>Service Specific</UnitType><Price>0.0<Factor>1</Factor></Price><Interval>1<Factor>1</Factor></Interval><UpdateType>Active</UpdateType></Tariff></Node>

So far I have been able to extract the data between the first <String> tags, i.e.

530101,530103,530105,509091,509092,509093,509095,530100,509099

But I am unable to extract the

<String>53546,50878,511150,51985,58595,563636,51980,50605,50405,55501,55255,55502,56694,55436,55437,518180,54334,58888084,55511,55522,563000,55507,55530,55541,55577,58327,54300,5303020,543210,554562,55323,50325,54400,511310,56660,56789,57323,52347,544111,55500,51802,54200,542001,57878,52000,50234,55694,54422,51121,54054,554356,53111,59888,53010,51477,51144,55321,55256,51967,50909,57677,1950,55177,52409,52256,52027,161,52134,58888111,58888511,58888711</String>

information.

The script I used is:

awk -F"," 'BEGIN{OFS=","}
{
if ($0 ~ "<Node")
  {
  a = index($0,"<String>")
  z = index($0,"</String>")
  print substr($0, a+8, z-(a+8)) > "Voice1.txt"
  }
}' Voice2.txt

Please help me get the remaining information.

You'll need to repeat the search:

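awk -F"," '
BEGIN           {OFS=","}
$0 ~ "<Node"    {match ($0, /<String>[^<]*<\/String>/)
                 # skip the 8 chars of <String> and drop the 9 chars of </String>
                 print substr ($0, RSTART+8, RLENGTH-17)
                 # cut off everything up to the end of the first match, then search again
                 $0 = substr ($0, RSTART + RLENGTH)
                 match ($0, /<String>[^<]*<\/String>/)
                 print substr ($0, RSTART+8, RLENGTH-17)
                }
' file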
530101,530103,530105,509091,509092,509093,509095,530100,509099
53546,50878,511150,51985,58595,563636,51980,50605,50405,55501,...
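
For what it's worth, the reason the original script stops after the first list is that index() always reports the first occurrence of the search string, so both calls land on the first <String>...</String> pair. A minimal sketch on a made-up, shortened line (the sample data and the variable names line and first_only are only for illustration, they are not from the real file):

```shell
# Hypothetical sample, much shorter than the real data
line='<Node><String>1,2</String><X/><String>3,4</String></Node>'

first_only=$(printf '%s\n' "$line" | awk '{
    a = index($0, "<String>")     # position of the FIRST opening tag only
    z = index($0, "</String>")    # position of the FIRST closing tag only
    print substr($0, a + 8, z - (a + 8))
}')
printf '%s\n' "$first_only"
```

Only the first list comes out; the second one is never reached, which is why the search has to be repeated on the remainder of the line.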

Thank you very much, it's working great and the output is as desired.

Just a thought!!!

Why do you need to use substr, match, etc.?

awk -F'<String>|</String>' '/<Node/{for(i=2;i<=NF;i+=2) print $i;}' file_name
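
To see why only the even-numbered fields are printed, it may help to dump every field for a short made-up line. The sample data and the variable name fields are hypothetical, and this assumes an awk that treats a multi-character FS as a regular expression, as POSIX awk does:

```shell
# Hypothetical sample, much shorter than the real data
line='<Node><String>1,2</String><X/><String>3,4</String></Node>'

# Both the opening and the closing tag act as field separators
fields=$(printf '%s\n' "$line" |
    awk -F'<String>|</String>' '{
        for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i
    }')
printf '%s\n' "$fields"
```

The text between the tags ends up in fields 2 and 4, while the odd fields hold whatever surrounds them, hence the i+=2 step starting at 2.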

-Ranga

Or try

awk -F"," '
BEGIN   {OFS=","}
/<Node/ {while (match ($0, /<String>[^<]*<\/String>/))  {print substr ($0, RSTART+8, RLENGTH-17)
                                                         $0 = substr ($0, RSTART + RLENGTH)
                                                        }
        }
' file
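
A quick way to convince yourself that the loop handles any number of lists is to run it on a made-up line with three of them. This is only a sketch: the sample data is hypothetical, and the copy variable rest is my own variation so that $0 is left untouched:

```shell
# Hypothetical sample with three <String> elements
line='<Node><String>1,2</String><String>3</String><String>4,5,6</String></Node>'

lists=$(printf '%s\n' "$line" | awk '
/<Node/ {
    rest = $0                      # work on a copy instead of changing $0
    while (match(rest, /<String>[^<]*<\/String>/)) {
        # skip the 8 chars of <String>, drop the 9 chars of </String>
        print substr(rest, RSTART + 8, RLENGTH - 17)
        # continue searching after the end of this match
        rest = substr(rest, RSTART + RLENGTH)
    }
}')
printf '%s\n' "$lists"
```

The loop ends as soon as match() returns 0, i.e. when no further <String>...</String> pair is left in the remainder.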

Or a small adaptation of rangarasan's proposal:

awk -F'<.?String>' '/<Node/{for(i=2;i<=NF;i+=2) print $i;}' file

To extract only the first <String>, this works on OS X:

awk  '{print $2}' FS="<String>|</String>" file

Try:

$ awk '$1=="String"{print $2}' RS=\< FS=\> file
530101,530103,530105,509091,509092,509093,509095,530100,509099
53546,50878,511150,51985,58595,563636,51980,50605,50405,55501,55255,55502,56694,55436,55437,518180,54334,58888084,55511,55522,563000,55507,55530,55541,55577,58327,54300,5303020,543210,554562,55323,50325,54400,511310,56660,56789,57323,52347,544111,55500,51802,54200,542001,57878,52000,50234,55694,54422,51121,54054,554356,53111,59888,53010,51477,51144,55321,55256,51967,50909,57677,1950,55177,52409,52256,52027,161,52134,58888111,58888511,58888711
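
In case it is not obvious how this works: with RS set to < every tag starts a new record, and with FS set to > the tag name becomes $1 and the text after it becomes $2. A small sketch on a made-up line that prints both fields for the String-related records (sample data and the variable name records are hypothetical; the same settings are passed here with -v and -F rather than as trailing assignments):

```shell
# Hypothetical sample, much shorter than the real data
line='<Node><String>1,2</String><X/><String>3,4</String></Node>'

# One record per tag; $1 is the tag name, $2 the text following the tag
records=$(printf '%s\n' "$line" |
    awk -v RS='<' -F'>' '$1 ~ /String$/ { printf "[%s] [%s]\n", $1, $2 }')
printf '%s\n' "$records"
```

Only the records whose first field is exactly String carry the list in $2, which is what the $1=="String" test selects. Note that an opening tag with attributes, such as <String id="x">, would not pass that test.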

Another way:

perl -nle 'map{print} /<String>(.*?)<\/String>/g' Voice2.txt
530101,530103,530105,509091,509092,509093,509095,530100,509099
53546,50878,511150,51985,58595,563636,51980,50605,50405,55501,55255,55502,56694,55436,55437,518180,54334,58888084,55511,55522,563000,55507,55530,55541,55577,58327,54300,5303020,543210,554562,55323,50325,54400,511310,56660,56789,57323,52347,544111,55500,51802,54200,542001,57878,52000,50234,55694,54422,51121,54054,554356,53111,59888,53010,51477,51144,55321,55256,51967,50909,57677,1950,55177,52409,52256,52027,161,52134,58888111,58888511,58888711