How can I parse an XML file?

How can I parse a file containing XML?
I am sure that it's best to use Perl, but my Perl is not very good. Can someone help?

Below is an example of the contents of the file containing the XML. I basically want to parse the file and store each field in a variable.

That is, I want to store the account number in one variable, the name in another, and the address in a third.

Then I could just echo $accountnumb $name $add etc. and get the following:
65004 Bob Daly Ireland

XML Sample File

<?xml version="1.0"?>

<po:Message
xmlns:po="http://192.168.50.167/cust/api"
xmlns="http://192.168.50.167/cust"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://192.168.50.167/cust/api http://database1.com:1008/cust/api/jsp/xsd/api_v3_xsd.jsp"
>
<po:Response>

    <po:Header>
        <po:RequestUrl><![CDATA[/cust/api/ACCOUNTDetail]]></po:RequestUrl>
        <po:RequestCommand>ACCOUNT_DETAILS</po:RequestCommand>
        <po:Version>6</po:Version>
        <po:Id>000001</po:Id>
        <po:Opid><![CDATA[XXX]]></po:Opid>
        <po:Status>
            <po:Response>SUCCESS</po:Response>
            <po:Text>
                <![CDATA[Service successfully completed.]]>
            </po:Text>
            <po:StatusKey><![CDATA[SERVICE_SUCCESS]]></po:StatusKey>
            <po:StatusParams cACCOUNT="0">
            </po:StatusParams>
        </po:Status>
    </po:Header>

    <po:Body>
        <po:RequestKey>
            <po:RequestCust>65004</po:RequestCust>
        </po:RequestKey>

        <ACCOUNT>
            <ACCOUNT>65004</ACCOUNT>
            <MasterACCOUNT>65004</MasterACCOUNT>
            <SubscriberType>SUBS_TYPE_STANDALONE</SubscriberType>
            <Name>Bob Daly</Name>
            <Add>Ireland</Add>

            <ClassDescs>
                <ClassDesc><![CDATA[Social Account]]></ClassDesc>
                <TempClassDesc><![CDATA[]]></TempClassDesc>
            </ClassDescs>
            <ClassChangedDateTime></ClassChangedDateTime>

            <PreferredLanguage><![CDATA[EN]]></PreferredLanguage>
            <PreferredCurrency><![CDATA[EUR]]></PreferredCurrency>
            <CurrentPromotionPlan><![CDATA[]]></CurrentPromotionPlan>
            <SubscriptionStatus>ACTIVE</SubscriptionStatus>

            <TempStatus>NOT_BLOCKED</TempStatus>
            <EocnSelStructId>255</EocnSelStructId>

            <Agent><![CDATA[00000000]]></Agent>
            <SubAgent><![CDATA[]]></SubAgent>
            <DisconnectReason><![CDATA[]]></DisconnectReason>

            <DisconReasonText><![CDATA[]]></DisconReasonText>

            <BeginDate>16-Jul-2008</BeginDate>

            <StartDate>16-Jul-2008</StartDate>

            <ServiceRemovalDate>12-Sep-2009</ServiceRemovalDate>

            <LastModification>12-Sep-2008 13:24:33</LastModification>

        </ACCOUNT>
    </po:Body>

</po:Response>

</po:Message>

An example to read those variables in a shell script with awk, adjust it if you want more fields:

#!/bin/sh

awk -F"<|>" '
$2=="ACCOUNT" && NF > 3{s=S3}
$2=="Name"{s=s" "S3}
$2=="Add"{s=s" "$3}
$2=="/ACCOUNT"{print s}
' file | 
while read accountnumb name add; do
  echo "$accountnumb" "$name" "$add"
  # do something with "$accountnumb" "$name" "$add" here, inside the loop
done
# more commands..

Regards

Thanks for the reply. I put the XML into a file called file,
then ran the script you provided, but it just returned to the command prompt with no output. Do you know why?

Use nawk, gawk or /usr/xpg4/bin/awk on Solaris.
Do you get any output with this?

#!/bin/sh

awk -F"<|>" '
$2=="ACCOUNT" && NF > 3{s=S3}
$2=="Name"{s=s" "S3}
$2=="Add"{s=s" "$3}
$2=="/ACCOUNT"{print s}
' file

Okay, I have now specified
/usr/xpg4/bin/awk

When I run the latest script I get only the following output:

Ireland

Hmm, I made a small change to the script and now I get:

Bob Daly Ireland

I changed from:
$2=="ACCOUNT" && NF > 3{s=S3}
$2=="Name"{s=s" "S3}
$2=="Add"{s=s" "$3}

to:
$2=="ACCOUNT" && NF > 3{s=S3}
$2=="Name"{s=s" "$3}
$2=="Add"{s=s" "$3}

I don't know enough about awk to understand why this worked, or what to do to get the ACCOUNT value as well. Can you explain what each line of the script is actually doing?

Sorry for the typos, S3 should be $3.
Try this. I've changed the internal field separator so the fields are split properly even if the values contain spaces:

#!/bin/sh

OFS=$IFS
IFS=:

/usr/xpg4/bin/awk -F"<|>" '
$2=="ACCOUNT" && NF > 3{s=$3}
$2=="Name"{s=s":"$3}
$2=="Add"{s=s":"$3}
$2=="/ACCOUNT"{print s}
' file |
while read accountnumb name add; do
  echo "$accountnumb" "$name" "$add"
  # Use "$accountnumb", "$name" and "$add" here, inside the loop;
  # in most shells the loop runs in a subshell, so they are unset after "done".
done

IFS=$OFS

# more commands...
exit 0
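
For anyone wondering what each rule does: with -F"<|>" awk splits every line at each < and >, so for a line like <Name>Bob Daly</Name> the tag name ends up in $2 and the value in $3. S3, by contrast, is just an unset awk variable (an empty string), which is why the first version lost the account number and the name. A minimal sketch to look at the fields yourself; the sample line and the [<>] bracket form of the separator are my own choices for illustration:

```shell
#!/bin/sh
# One sample line, as it appears inside the <ACCOUNT> block.
line='            <Name>Bob Daly</Name>'

# Split on "<" or ">" and print every field together with its number.
# For this line, $2 is the tag name and $3 is the value.
fields=$(printf '%s\n' "$line" |
  awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Each rule in the script is a pattern followed by an action in braces. $2=="ACCOUNT" && NF > 3 matches only the inner <ACCOUNT>65004</ACCOUNT> line (the bare <ACCOUNT> line has just three fields), the Name and Add rules append their values to s, and the closing </ACCOUNT> line prints the result.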

Thanks for the help so far, it's been very useful.

One more snag: another XML file I may have to parse contains something like the block below.

I need to get the Acc total for each sub account ID.
That is, there are 3 sub accounts but the tags are the same. What can I do here?

          <RecSubaccs>

                   <RecSubacc>
                     <SubaccId>1</SubaccId>
                     <RecAccTotal>0</RecAccTotal>
                   </RecSubacc>

                   <RecSubacc>
                     <SubaccId>2</SubaccId>
                     <RecAccTotal>0</RecAccTotal>
                   </RecSubacc>

                   <RecSubacc>
                     <SubaccId>3</SubaccId>
                     <RecAccTotal>0</RecAccTotal>
                   </RecSubacc>

          </RecSubaccs>

You could also use a CPAN module (XML::pick_one); see "Stepping up from XML::Simple to XML::LibXML".

awk -v v=SubaccId -F'[<|>]' '$2==v{s=$3;getline;a[s]+=$3}END {for (i in a)print v,i,a[i]}' file

Next time please start a new topic for your new question.

I asked this question in the same thread because I need to do this in the same script, so it's a sub-question.

Can you tell me how to do this within the same script, so I can still get the name, add etc. but also get the subtotals for each of the sub accounts?

That is, I need to be able to echo account, name, add, subtotal1, subtotal2 and subtotal3.

How can I do this?
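
One way this could be done in a single awk pass, as a sketch only. It assumes the RecSubaccs block sits inside the same ACCOUNT element as Name and Add, which the samples in this thread do not confirm; the temporary file and the totals 0, 5 and 12 are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample: the ACCOUNT fields plus a RecSubaccs block in one file.
# The totals 0, 5 and 12 are invented so the three values can be told apart.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<ACCOUNT>
    <ACCOUNT>65004</ACCOUNT>
    <Name>Bob Daly</Name>
    <Add>Ireland</Add>
    <RecSubaccs>
        <RecSubacc>
            <SubaccId>1</SubaccId>
            <RecAccTotal>0</RecAccTotal>
        </RecSubacc>
        <RecSubacc>
            <SubaccId>2</SubaccId>
            <RecAccTotal>5</RecAccTotal>
        </RecSubacc>
        <RecSubacc>
            <SubaccId>3</SubaccId>
            <RecAccTotal>12</RecAccTotal>
        </RecSubacc>
    </RecSubaccs>
</ACCOUNT>
EOF

# Build one colon-separated record: account:name:add:total1:total2:total3
record=$(awk -F'[<>]' '
$2 == "ACCOUNT" && NF > 3 { s = $3 }        # inner <ACCOUNT>65004</ACCOUNT>
$2 == "Name"              { s = s ":" $3 }
$2 == "Add"               { s = s ":" $3 }
$2 == "RecAccTotal"       { s = s ":" $3 }  # appended in file order
$2 == "/ACCOUNT"          { print s }       # closing tag: emit the record
' "$xml")
rm -f "$xml"

# Split the record on ":" in the current shell (no pipeline, so no subshell).
IFS=: read -r accountnumb name add subtotal1 subtotal2 subtotal3 <<EOF
$record
EOF

echo "$accountnumb" "$name" "$add" "$subtotal1" "$subtotal2" "$subtotal3"
```

Reading from a here-document instead of piping into while keeps the variables available to the rest of the script. The fixed list of three subtotal variables only works if the number of sub accounts is known in advance.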

This code worked for me. Can you explain the awk code and what it's doing at each step so I can understand it?

Can you tell me how I need to modify it if there are more fields in the XML response?

For example, if a different response contained additional information, as below (the RedAccType lines), what do I need to change in the awk code?

<RecSubaccs>

<RecSubacc>
<SubaccId>1</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Perm</RedAccType>
</RecSubacc>

<RecSubacc>
<SubaccId>2</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Perm</RedAccType>
</RecSubacc>

<RecSubacc>
<SubaccId>3</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Temp</RedAccType>
</RecSubacc>

</RecSubaccs>

I'll try:

# cat awk.file
BEGIN{
      FS="[<|>]"     # Set the field separator
      v="SubaccId"   # Set v, the tag name to search for
      }
$2==v{               # If the second field equals v...
      s=$3           # Assign the third field (the sub account id) to s
      getline        # Read the next line (the RecAccTotal line)
      a[s]+=$3       # Add its third field to element s of array a
      }
END{                 # At the end do...
    for (i in a)     # For each index i of array a...
    print v, i, a[i] # Print v (static text), i (the index) and a[i] (its total)
   }

and run the code as:

awk -f awk.file data.file

and please start a new topic for your new question!
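
For the extra field, one option (a sketch, not the only way) is to stop relying on getline and a fixed line order, and instead match each tag by name and print when the closing </RecSubacc> is seen. The variable names id, total and kind and the temporary file are my own; the data is the second sample posted above:

```shell
#!/bin/sh
# The second sample from this thread, written to a temporary file.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<RecSubaccs>

<RecSubacc>
<SubaccId>1</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Perm</RedAccType>
</RecSubacc>

<RecSubacc>
<SubaccId>2</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Perm</RedAccType>
</RecSubacc>

<RecSubacc>
<SubaccId>3</SubaccId>
<RecAccTotal>0</RecAccTotal>
<RedAccType>Temp</RedAccType>
</RecSubacc>

</RecSubaccs>
EOF

out=$(awk -F'[<>]' '
$2 == "SubaccId"    { id = $3 }            # start of a record: remember its id
$2 == "RecAccTotal" { total[id] += $3 }    # sum the totals per sub account
$2 == "RedAccType"  { kind[id] = $3 }      # the additional field
$2 == "/RecSubacc"  { print id, total[id], kind[id] }  # end of record: print it
' "$xml")
rm -f "$xml"

printf '%s\n' "$out"
```

Printing at the closing tag keeps the records in file order, whereas for (i in a) in an END block visits the indexes in no guaranteed order. Any further field would need one more rule of the same shape.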

Thanks again for the help and for the explanation.

I have opened a new thread for help with the additional question:
http://www.unix.com/shell-programming-scripting/81195-parsing-xml-using-awk-more-help-needed.html#post302236294