BASH script to parse XML and generate CSV

Hi All,

I hope you are all doing well. I need your help: I have an XML file that needs to be converted to a CSV file. I am not an expert in awk/sed, so your help is highly appreciated!

XML file looks like this:

<l:event dateTime="2013-03-13 07:15:54.713" layerName="OSB" processName="ABC" eventType="END" eventStatus="SUCCESS" eventCode="" outboundServiceName="" serverHostIP="*******" applicationID="someapp" providerID="****" originatorIP="A.B.C.D" SOAConsumerTransactionID="" SOATransactionID="2f9bbaf6-ae21-41ab-925a-998a5efd82a0"><xml:changeEligibleTariff_1Response xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xml="http://soa.o2.ie/xml__1_0"><xml:result><cor:statusCode xmlns:cor="http://soa.o2.ie/coredata_1_0">mt-22-6001-W</cor:statusCode><cor:externalDescription xmlns:cor="http://soa.o2.ie/coredata_1_0">Partial Success.The Request is under process.</cor:externalDescription></xml:result><xml:tariffCode>52</xml:tariffCode><xml:transactionId>5850034</xml:transactionId><xml:msisdn>353867877726</xml:msisdn><xml:subscriberType>PREPAY</xml:subscriberType><xml:accountNumber>229951060</xml:accountNumber><xml:channelIdentifier>SOMEAPP</xml:channelIdentifier><xml:agentIdentifier>SOMEAPP</xml:agentIdentifier></xml:changeEligibleTariff_1Response>
</l:event>

The generated CSV should look like this:

2013-03-13 07:15:54.713,2f9bbaf6-ae21-41ab-925a-998a5efd82a0,52,5850034,229951060,mt-22-6001-W,Partial Success.The Request is under process.,SOMEAPP

Thanks in advance.

Regards,
Bhaskar

You can use awk to extract the required attributes and values.

Here is some code that extracts the first three required fields:

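awk -F'[=>]' ' {
                # Fields are split on "=" and ">", so an attribute value or an
                # element text always sits in the field right after its name.
                for ( i = 1; i <= NF; i++ )
                {
                        if ( $i ~ /dateTime/ )
                        {
                                dT = $( i + 1 )
                                # Drop the quotes and the next attribute name.
                                gsub (/\"[ ]+.*|\"/, "", dT)
                        }
                        if ( $i ~ /SOATransactionID/ )
                        {
                                SID = $( i + 1 )
                                gsub (/\">.*|\"/, "", SID)
                        }
                        if ( $i ~ /<xml:tariffCode/ )
                        {
                                tC = $( i + 1 )
                                # Drop the closing tag that follows the value.
                                gsub (/<.*/, "", tC)
                        }
                        # Print one row per event, so a file with several
                        # events does not report only the last one.
                        if ( $i ~ /<\/l:event/ )
                        {
                                print dT, SID, tC
                                dT = SID = tC = ""
                        }
                }
} ' OFS=, xmlfile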

I will leave extracting the rest to you.
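
As a starting point for the rest, here is a rough sketch of how all eight columns could be collected with the same field-splitting idea. It is only a sketch under a few assumptions: the function name xml_to_csv and the awk variable names are made up for illustration, attributes are written as name="value" with no spaces around the "=", and no value contains "=", ">" or a comma (a comma would need CSV quoting). Instead of looking at the field after an opening tag, it reads element text from the field that ends with the closing tag, which also works for elements such as cor:statusCode that carry an xmlns attribute.

```shell
# xml_to_csv is a hypothetical helper name; pass the XML file as $1.
xml_to_csv() {
    awk -F'[=>]' -v OFS=, '
    # Attribute value: drop the opening quote, then everything from the closing quote on.
    function attr(s) { sub(/^"/, "", s); sub(/".*/, "", s); return s }
    # Element text: drop the closing tag that follows it in the same field.
    function text(s) { sub(/<\/.*/, "", s); return s }
    {
        for (i = 1; i <= NF; i++) {
            if      ($i ~ / dateTime$/)                  dT   = attr($(i + 1))
            else if ($i ~ / SOATransactionID$/)          sid  = attr($(i + 1))
            else if ($i ~ /<\/xml:tariffCode$/)          tC   = text($i)
            else if ($i ~ /<\/xml:transactionId$/)       txn  = text($i)
            else if ($i ~ /<\/xml:accountNumber$/)       acct = text($i)
            else if ($i ~ /<\/cor:statusCode$/)          code = text($i)
            else if ($i ~ /<\/cor:externalDescription$/) desc = text($i)
            else if ($i ~ /<\/xml:channelIdentifier$/)   chan = text($i)
            else if ($i ~ /<\/l:event$/) {
                # End of one event: emit its row and start the next one clean.
                print dT, sid, tC, txn, acct, code, desc, chan
                dT = sid = tC = txn = acct = code = desc = chan = ""
            }
        }
    }' "$1"
}
```

It would be called as xml_to_csv xmlfile > out.csv and should write one CSV line per l:event element, whether the closing </l:event> tag is on the same line as the rest of the event or on a line of its own. For anything beyond log lines of this regular shape, a real XML parser (xmlstarlet, xsltproc and the like) is the safer choice.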

Hi Yoda,

Thanks a lot for your effort. That will definitely help.

Regards,
Bhaskar