Need to split a big XML file into multiple XML files

Hi friends,
We have an urgent requirement. We need to split a big XML file containing multiple orders into multiple XML files,
with one order in each file.
For example, the input XML will be in the following format, with multiple orders:

<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder" >
<Order Number="1">
<OrderLine CarrierServiceCode="G2" FulfillmentType="ShipToHome"/>
<Item ItemDesc="iPearl 8GB MP3 " ItemID="11239924" ItemWeight="0.00" />
<LinePriceInfo ActualPricingQty="1.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
<Order Number="2">
<OrderLine CarrierServiceCode="H2" FulfillmentType="ShipToHome" />
<Item ItemDesc="TV" ItemID="112345424" ItemWeight="67.00" />
<LinePriceInfo ActualPricingQty="1.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
<Order Number="3">
<OrderLine CarrierServiceCode="M2" FulfillmentType="ShipToHome" />
<Item ItemDesc="TV" ItemID="4545345" ItemWeight="67.00" />
<LinePriceInfo ActualPricingQty="6.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
</OrderDetail>

Expected output:
We want to split this XML into 3 XML files.
The 1st file should contain Order Number=1, the 2nd file Order Number=2, and so on.
Thanks & Regards,
Prakash


Try this,

awk '/Order Number=\"/ {a=$0; i=0; while (i<4) {getline; a=a "\n" $0; ++i}; f=(++j ".xml"); print a > f; close(f)}' input.xml

Hi Pravin,
Thanks for the quick reply.
I executed your command and I am getting 3 XML files,
but the </Order> tag is missing in the 2nd and 3rd files, although it is present in the 1st.
Could you please look into it once again?


It's working fine at my end.

Try this:

awk '/Order Number=\"/ {a=$0; i=0; while (i<4) {getline; a=a "\n" $0; i++}; f=(++j ".xml"); print a > f; close(f)}' input.xml

Thanks a lot, Pravin. It's working.

---------- Post updated at 07:24 AM ---------- Previous update was at 06:38 AM ----------

Hi Pravin,
We are not able to split our real Order XML with your command.
In the sample below, the dots stand for any number of lines.
We need to capture the data between the <Order Number="1"> tag and the </Order> tag for each order,
and we also need to add tags other than <Order> to each XML file.
For example, the PaymentMethods and LineCharge tag data needs to be added to each XML file.

<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder" >
<Order Number="1">
<OrderLine CarrierServiceCode="G2" FulfillmentType="ShipToHome"/>
<Item ItemDesc="iPearl 8GB MP3 " ItemID="11239924" ItemWeight="0.00" />
.....
.
....
....
..

</Order>
<Order Number="2">
<OrderLine CarrierServiceCode="H2" FulfillmentType="ShipToHome" />
....
....
...
</Order>
<Order Number="3">
<OrderLine CarrierServiceCode="M2" FulfillmentType="ShipToHome" />
<Item ItemDesc="TV" ItemID="4545345" ItemWeight="67.00" />
<LinePriceInfo ActualPricingQty="6.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
....
......
.........
............
</Order>

<PaymentMethods>
<PaymentMethod HoldType="ADDRESS_VAL_HOLD" Status="1100" StatusDescription="Created" />
...
.....
</PaymentMethods>
<LineCharge ChargeAmount="0.00" ChargeCategory="SHIPPING" ChargeName="ShippingCharge" />
</OrderDetail>
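
The "capture everything from the <Order Number=...> line to the </Order> line" requirement can be sketched without counting lines. This is only a minimal Python illustration, not one of the posted answers; the function name order_blocks is made up for the example, and it assumes each opening and closing Order tag sits on its own line, as in the sample above.

```python
def order_blocks(lines):
    """Yield each run of lines from an <Order Number=...> line through </Order>."""
    block = None  # None means we are currently outside an order
    for line in lines:
        if block is None and "<Order Number=" in line:
            block = [line]              # start collecting a new order
        elif block is not None:
            block.append(line)
            if "</Order>" in line:      # end of the current order
                yield "".join(block)
                block = None


if __name__ == "__main__":
    sample = [
        '<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder" >\n',
        '<Order Number="1">\n',
        '<OrderLine CarrierServiceCode="G2" FulfillmentType="ShipToHome"/>\n',
        '</Order>\n',
        '<Order Number="2">\n',
        '<OrderLine CarrierServiceCode="H2" FulfillmentType="ShipToHome" />\n',
        '<Item ItemDesc="TV" ItemID="112345424" ItemWeight="67.00" />\n',
        '</Order>\n',
        '</OrderDetail>\n',
    ]
    for number, block in enumerate(order_blocks(sample), start=1):
        print("order", number, "has", block.count("\n"), "lines")
```

Because the block ends at the closing tag rather than after a fixed number of lines, orders of different lengths are handled the same way.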

Try this Perl script, parse.pl:

#!/usr/bin/perl

use strict;
use warnings;

my $input = "/path/to/ur/inputxmlfile";
my $ord_flg = 0;       # true while inside an <Order> element
my $paymnt_flag = 0;   # true while inside <PaymentMethods>
my $str = "";          # shared lines: PaymentMethods block plus LineCharge

# First pass: collect the PaymentMethods block and the LineCharge lines.
open (my $in,"<",$input) || die "can not open file: $!\n";

while (<$in>) {
    if (/<PaymentMethods>/) {
        $str = $_;
        $paymnt_flag = 1;
        next;
    }
    if (/<\/PaymentMethods>/) { $str .= $_; $paymnt_flag = 0; }
    if ($paymnt_flag == 1 && $_ !~ /<\/PaymentMethods>/) {
        $str .= $_;
    }
    if (/<LineCharge/) { $str .= $_; }
}

close($in);

# Second pass: write each <Order>...</Order> block to its own file,
# followed by the shared lines collected above.
open ($in,"<",$input) || die "can not open file: $!\n";
my $i = 1;
my $out;

while (<$in>) {
    if (/<Order Number="/) {
        my $filename = $i.".xml";
        open ($out,">","/path/output/$filename") || die "Can not create file for write: $!\n";
        print $out $_;
        $ord_flg = 1;
        $i++;
        next;
    }
    if ($ord_flg == 1 && /<\/Order>/) { print $out "$_$str"; $ord_flg = 0; close($out); next; }
    if ($ord_flg == 1) { print $out $_; }
}
close($in);

Invocation

perl parse.pl
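
For comparison, the same idea (collect the shared elements once, then emit one document per order) can be sketched with an XML parser instead of line matching. This is a hedged illustration in Python, not part of the Perl answer: the function name split_orders and the Order_N.xml naming are assumptions made for the example, and it treats every child of the root that is not an Order as shared content, and wraps each result in a copy of the root element so it has a single root.

```python
import copy
import xml.etree.ElementTree as ET


def split_orders(xml_text):
    """Return {filename: xml_string}, one document per <Order> element."""
    root = ET.fromstring(xml_text)
    # Assumption: anything under the root that is not an <Order>
    # (PaymentMethods, LineCharge, ...) belongs in every output file.
    shared = [child for child in root if child.tag != "Order"]
    results = {}
    for order in root.findall("Order"):
        # New root with the same tag and attributes as the original root.
        new_root = ET.Element(root.tag, dict(root.attrib))
        new_root.append(copy.deepcopy(order))
        for element in shared:
            new_root.append(copy.deepcopy(element))
        name = "Order_{}.xml".format(order.get("Number"))
        results[name] = ET.tostring(new_root, encoding="unicode")
    return results


if __name__ == "__main__":
    sample = """<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder">
<Order Number="1"><OrderLine CarrierServiceCode="G2" FulfillmentType="ShipToHome"/></Order>
<Order Number="2"><OrderLine CarrierServiceCode="H2" FulfillmentType="ShipToHome"/></Order>
<PaymentMethods><PaymentMethod HoldType="ADDRESS_VAL_HOLD" Status="1100" StatusDescription="Created"/></PaymentMethods>
<LineCharge ChargeAmount="0.00" ChargeCategory="SHIPPING" ChargeName="ShippingCharge"/>
</OrderDetail>"""
    for name in sorted(split_orders(sample)):
        print(name)
```

Each returned string can then be written to a file under its key. Parsing the whole input this way loads it into memory, so it suits files that fit comfortably in RAM.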

An awk alternative that wraps each order in the OrderDetail root element:

awk 'NR==1{x=$0;next} /<\/Order>/{print y RS $0 RS "</OrderDetail>" > f; close(f)} /Order Number/{f="file" (++n) ".xml"; y=x} {y=y RS $0}' file.xml

Here is an XSLT 2.0 stylesheet which does what you want using the xsl:result-document instruction.

<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="OrderDetail">
     <xsl:apply-templates select="Order" />
   </xsl:template>

   <xsl:template match="Order">
      <xsl:result-document method="xml" href="Order_{@Number}.xml">
          <xsl:element name='{name()}'>
          <xsl:attribute name="Number" select='@Number' />
          <xsl:copy-of select="child::*"/>
          <xsl:copy-of select="../PaymentMethods"/>
          <xsl:copy-of select="../LineCharge"/>
          </xsl:element>
      </xsl:result-document>
   </xsl:template>

</xsl:stylesheet>
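
Another Perl option, reading the sample from the __DATA__ section:

#!/usr/bin/perl
use strict;
use warnings;

my ($header, $footer, $out, $file);
while (<DATA>) {
  # The first line is the root element's opening tag; keep it and build the closing tag.
  if ($. == 1 && /<(\S+)/) {
    $header = $_;
    $footer = "</$1>\n";
    next;
  }
  # Each <Order Number="N"> line starts a new output file.
  if (/<Order\s+Number="(\d+)">/) {
    if ($out) {
      # Finish the previous file with the root closing tag.
      print $out $footer or die "Can not write to file $file";
      close $out or die "Can not close file $file";
    }
    $file = "Order_Number_$1.txt";
    open $out, ">", $file or die "can not open file $file";
    print $out $header;
  }
  print $out $_ if $out;
}
close $out if $out;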
__DATA__
<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder" >
<Order Number="1">
<OrderLine CarrierServiceCode="G2" FulfillmentType="ShipToHome"/>
<Item ItemDesc="iPearl 8GB MP3 " ItemID="11239924" ItemWeight="0.00" />
<LinePriceInfo ActualPricingQty="1.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
<Order Number="2">
<OrderLine CarrierServiceCode="H2" FulfillmentType="ShipToHome" />
<Item ItemDesc="TV" ItemID="112345424" ItemWeight="67.00" />
<LinePriceInfo ActualPricingQty="1.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
<Order Number="3">
<OrderLine CarrierServiceCode="M2" FulfillmentType="ShipToHome" />
<Item ItemDesc="TV" ItemID="4545345" ItemWeight="67.00" />
<LinePriceInfo ActualPricingQty="6.00" BundleTotal="0.00" DiscountPercentage="0.00" LineTotal="80.79" />
</Order>
</OrderDetail>
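
Since the input is described as big, a streaming parse may be worth considering so the whole file is never held in memory. The following is a rough Python sketch under stated assumptions, not a tested production solution: the function name iter_order_documents is invented for the example, Order elements are assumed to be direct children of the root, and shared elements such as PaymentMethods are not handled here (they appear after the orders, so they would need a separate first pass).

```python
import io
import xml.etree.ElementTree as ET


def iter_order_documents(source):
    """Yield (order_number, xml_string) for each <Order>, parsing incrementally."""
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start" and root is None:
            root = elem  # the first start event belongs to the root element
        elif event == "end" and elem.tag == "Order" and root is not None:
            # Wrap the finished order in a copy of the root tag and attributes.
            wrapper = ET.Element(root.tag, dict(root.attrib))
            wrapper.append(elem)
            yield elem.get("Number"), ET.tostring(wrapper, encoding="unicode")
            # Assumption: <Order> is a direct child of the root, so it can be
            # detached here to keep memory use roughly constant.
            root.remove(elem)


if __name__ == "__main__":
    data = (b'<OrderDetail BillToKey="20100805337" Createuserid="CreateGuestOrder">\n'
            b'<Order Number="1"><Item ItemID="11239924"/></Order>\n'
            b'<Order Number="2"><Item ItemID="112345424"/></Order>\n'
            b'</OrderDetail>\n')
    for number, text in iter_order_documents(io.BytesIO(data)):
        print("Order", number, "->", len(text), "characters")
```

The source argument can be a file name or an open binary file object; each yielded string could be written to a file such as Order_Number_N.xml as it is produced.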