Splitting a file into 4 files containing the same name pattern

Hello,

I have one file, about 20 MB in size, that I want to split into four files of 5 MB each:

ABCD_XYZ_20130302223203.xml

The requirement is to write a script in which the first three files are 5 MB each and the fourth file holds whatever remains, whether that is less or more than 5 MB.

Appreciate your help in this.

split -b 5M ABCD_XYZ_20130302223203.xml

ajju,

Do you want to create only four files?
I mean, if the file size is 23 MB, do you want four files (the last one containing everything after 15 MB) or five files?

file1=5MB
file2=5MB
file3=5MB
file4=5MB
file5=3MB

or

file1=5MB
file2=5MB
file3=5MB
file4=8MB

Both solutions are acceptable.

Primarily I want four files that follow the same naming convention as the original, with 1, 2, 3, ... appended as a suffix:

file1=5MB
file2=5MB
file3=5MB
file4=8MB

filename="ABCD_XYZ_20130302223203.xml"
split -b 5M -a2 -d "$filename" "${filename}_"
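
Note that -d (numeric suffixes) is an extension and is not part of the POSIX split specification. Where it is unavailable, one possible workaround is to let split use its default aa, ab, ... suffixes and rename the pieces afterwards. The following is only a sketch: the function name split_numbered and the temporary "_tmp_" prefix are made up for illustration, and the chunk size is passed as a plain byte count to avoid differences in how size suffixes are spelled.

```shell
#!/bin/sh
# Hypothetical helper: split FILE into pieces of SIZE bytes and give the
# pieces numeric suffixes (FILE_1, FILE_2, ...) without using split -d.
split_numbered() {
    file=$1
    size=$2
    # Without -d, split names the pieces <prefix>aa, <prefix>ab, ...
    split -b "$size" "$file" "${file}_tmp_" || return 1
    n=1
    # The shell expands the glob in sorted order, so aa is handled before ab.
    for piece in "${file}_tmp_"*
    do
        mv "$piece" "${file}_$n"
        n=$((n + 1))
    done
}
```

For example, split_numbered ABCD_XYZ_20130302223203.xml 5242880 would produce the five-file variant for a 23 MB input; the last piece simply holds whatever is left.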

Pravin,
Thanks for your input, but this is AIX and -d is not working :(

Any other options?

---------- Post updated at 05:54 AM ---------- Previous update was at 03:17 AM ----------

Can't we do this with awk?

You could do this with awk, but it really isn't the right tool for this job.

The following seems to do what you want and should work on AIX:

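#!/bin/ksh
# Split $file into four parts: three of exactly 5 MB, then the remainder.
file="ABCD_XYZ_20130302223203.xml"
base="${file%.*}_part"		# name without its extension, plus "_part"
ext=".${file##*.}"		# original extension, e.g. ".xml"
m5=$((5*1024*1024))		# 5 MB block size in bytes
for i in 1 2 3 4
do	# Skip the blocks already written to earlier parts.
	if [ $i -gt 1 ]
	then	skip="skip=$((i - 1))"
	else	skip=""
	fi
	# Parts 1-3 get one block each; part 4 gets everything that is left.
	if [ $i -lt 4 ]
	then	count="count=1"
	else	count=""
	fi
	of="$base$i$ext"
	printf "Creating %s\n" "$of"
	# $count and $skip are left unquoted so that empty values disappear.
	dd if="$file" of="$of" bs=$m5 $count $skip
done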

With m5 set to 5 instead of 5242880 and a much smaller file for testing, you might get output something like:

Creating ABCD_XYZ_20130302223203_part1.xml
1+0 records in
1+0 records out
5 bytes transferred in 0.000044 secs (113360 bytes/sec)
Creating ABCD_XYZ_20130302223203_part2.xml
1+0 records in
1+0 records out
5 bytes transferred in 0.000036 secs (138884 bytes/sec)
Creating ABCD_XYZ_20130302223203_part3.xml
1+0 records in
1+0 records out
5 bytes transferred in 0.000027 secs (185589 bytes/sec)
Creating ABCD_XYZ_20130302223203_part4.xml
68+1 records in
68+1 records out
341 bytes transferred in 0.000303 secs (1125301 bytes/sec)

leaving you with the files:

-rwxr-xr-x  2 dwc  staff  356 Apr 16 17:51 ABCD_XYZ_20130302223203.xml
-rw-r--r--  1 dwc  staff    5 Apr 16 18:00 ABCD_XYZ_20130302223203_part1.xml
-rw-r--r--  1 dwc  staff    5 Apr 16 18:00 ABCD_XYZ_20130302223203_part2.xml
-rw-r--r--  1 dwc  staff    5 Apr 16 18:00 ABCD_XYZ_20130302223203_part3.xml
-rw-r--r--  1 dwc  staff  341 Apr 16 18:00 ABCD_XYZ_20130302223203_part4.xml
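
A quick way to check that nothing was lost is to concatenate the parts in order and compare the result with the original. This is a minimal sketch; the function name verify_parts is invented for illustration, and it relies on cmp accepting "-" for standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the listed parts, concatenated in
# the order given, are byte-for-byte identical to the original file.
verify_parts() {
    original=$1
    shift
    # cmp -s prints nothing and reports the result through its exit status.
    cat "$@" | cmp -s - "$original"
}
```

With the files above, this might be called as verify_parts ABCD_XYZ_20130302223203.xml ABCD_XYZ_20130302223203_part[1-4].xml, followed by a test of the exit status.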

Dear Don,

Thank you for a very clever solution, but the split files are not being processed by the system because the tags end up divided between files: part of a record lands in one file and the rest in the next.

So I need to split this file on a content basis, adding the opening and closing tags to each split file.
For example, the original file has these opening tags:

<?xml version="1.0" encoding="UTF-8"?>
<ns0:ABCFile xmlns:ns0="urn:PQR:OTHERS:WXYZ:HELLOTEST">
<ABCFileHeader>
<RecordType>01</RecordType>
<Date>20140405</Date>
<TotalRecord>46048</TotalRecord>  // 46048/4 = 11512 records in each file 
</ABCFileHeader>
.
.

An actual record starts like this:

<ABRecordDetail>
<RecordType>02</RecordType>
<LineItem>0000000002</LineItem>
<CompanyCode>PQR</CompanyCode>
<ABDate>20130901</ABDate>
<CurrencyKey>PVR</CurrencyKey>
<AmountInDC>0</AmountInDC>
<AmountInLC>0</AmountInLC>
<CostCenter>BBN</CostCenter>
<FType>DTH</FType>
<QNumber>VBR3581 </QNumber>
<SNumber>9kBQ</SNumber>
<VNumber>BBGRB</VNumber>
<Assignment>0945</Assignment>
</ABRecordDetail>

The 15 lines above make up one record, and the original file has 46048 such records. I want to split it so that each file gets 46048/4 = 11512 records, plus the opening and closing tags.

Opening tags:

<?xml version="1.0" encoding="UTF-8"?>
<ns0:ABCFile xmlns:ns0="urn:PQR:OTHERS:WXYZ:HELLOTEST">
<ABCFileHeader>
<RecordType>01</RecordType>
<Date>20140405</Date>
<TotalRecord>46048</TotalRecord>  // 46048/4 = 11512 records in each file, so in each split file the tag would be <TotalRecord>11512</TotalRecord>
</ABCFileHeader>

Closing tag:

</ns0:ABCFile>

I hope that is clear. Put simply, the file needs to be split on a content basis (record by record, 15 lines per record), with the fixed tags added at the top and bottom of each file.
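
The record-based split described above could be sketched roughly as follows. This is an untested-on-AIX illustration under several assumptions taken from the sample: every record starts with a line containing <ABRecordDetail> and ends with a line containing </ABRecordDetail>, everything before the first record is the header to repeat, and </ns0:ABCFile> is the only closing line. The function name split_records is hypothetical, and each part is buffered in memory before it is written so that <TotalRecord> can be set to the part's own record count.

```shell
#!/bin/sh
# Hypothetical helper: split_records FILE PER
# Writes FILE's records into <name>_part1.<ext>, <name>_part2.<ext>, ...
# with at most PER records each, repeating the header and closing tag.
split_records() {
    file=$1
    per=$2
    base="${file%.*}_part"
    ext=".${file##*.}"
    awk -v per="$per" -v base="$base" -v ext="$ext" '
    # Write the buffered records as one part, then reset the buffer.
    function flush(   h, out) {
        if (n == 0) return
        part++
        out = base part ext
        h = header
        # Put the record count of this part into the copied header.
        sub(/<TotalRecord>[0-9]*<\/TotalRecord>/, "<TotalRecord>" n "</TotalRecord>", h)
        printf "%s%s%s\n", h, body, "</ns0:ABCFile>" > out
        close(out)
        body = ""
        n = 0
    }
    /<ABRecordDetail>/   { inbody = 1 }                      # first record reached
    !inbody              { header = header $0 "\n"; next }   # still in the header
    /<\/ns0:ABCFile>/    { next }                            # re-added by flush
                         { body = body $0 "\n" }
    /<\/ABRecordDetail>/ { if (++n == per + 0) flush() }     # part is full
    END                  { flush() }                         # leftover records
    ' "$file"
}
```

With the figures quoted above, the call might look like split_records ABCD_XYZ_20130302223203.xml 11512; if the record count is not an exact multiple, the last part simply holds fewer records.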

You didn't mention anything about XML records before. You said you wanted three files that had a size of exactly 5MB and the remainder of the original file stored in a fourth file.

Now that you have a completely different problem, please start a new thread. And, please show us what you have tried to do to solve this new problem when you open your new thread.

This thread is now closed.