csplit not behaving

I have a large file in which the first two characters of each line determine the record type. Type 03 is a subheader, and each subheader is followed by multiple 04 records.

e.g.:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

I want to end up with N files, one per subheader, like this:

file n+1
03,xxxx,xxxx,xxxx
04,xxxxxxxxxxxxxxx

file n+2
03,xxxx,xxx,xx
04,xxxxxxxxxxxxx

I am using the script below, which according to the syntax in the csplit man page should work (this is on HP-UX, by the way):

#!/bin/ksh

set -x

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print NR; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers to 1 305 315 398 509 515

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
newline=''
for val in $(cut -d" " -f$i data)
do
newline=$newline$val" "
done
nline=`print ${newline%?}`
print $nline >> tmpdata
(( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

#Which looks like csplit MyFile 305 315 398 509 515
#But csplit seems to split the first file at line 152 and not 305, and then the subsequent splits are wrong as well

As I understand it, your problem is to split a file at every line starting with 03.

testfile:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

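I used csplit -z testfile '/^03/' '{*}' with success:
-z prevents empty output files
/^03/ splits at each line starting with 03
{*} repeats the pattern until EOF

This was with GNU csplit (-z and {*} are GNU extensions, so the HP-UX csplit may not accept them).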
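
If GNU csplit is not available (the question is on HP-UX), the same "new file at every 03 line" split can probably be done with plain awk. This is only a minimal sketch: the sample data, the file name testfile and the output prefix part are made up for illustration, and I have not run it on HP-UX.

```shell
#!/bin/sh
# Sketch: split on every line starting with "03" using only awk.
# "testfile" and the "part" prefix are illustrative names, not from the thread.
cat > testfile <<'EOF'
03,aaa,xxxx,xxxx
04,xxxxxxxx
04,xxxxxxxx
03,bbb,xxx,xxx
04,xxxxxxxx
04,xxxxxxxx
EOF

awk '
/^03/ {                          # a subheader starts a new output file
    if (out != "") close(out)    # close the previous piece first
    out = sprintf("part%02d", n++)
}
out != "" { print > out }        # lines before the first 03 are skipped
' testfile
```

Each 03 line and the 04 lines under it end up in part00, part01 and so on. Closing the previous file before opening the next one keeps the number of open files at one, which matters when the input has many subheaders.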

In the end I got this working:

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print (NR*2)-1; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers eg: 1 305 315 398 509 515

#csplit on HP-UX seems to split at just under half the requested line number, so I have doubled the NR above

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
newline=''
for val in $(cut -d" " -f$i data)
do
newline=$newline$val" "
done
nline=`print ${newline%?}`
print $nline >> tmpdata
(( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

#Which looks like csplit MyFile 305 315 398 509 515
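
For anyone reusing this, the transposing loop can probably be replaced by a single pipeline, since paste -s joins the one-per-line numbers for you. The sketch below uses a small made-up MyFile and prints the plain NR; the (NR*2)-1 adjustment in the script above is a workaround for what I saw on my HP-UX box, so only add it back if your csplit misbehaves the same way.

```shell
#!/bin/sh
# Sketch: build the csplit line list in one pipeline instead of a loop.
# This MyFile is a small stand-in for the real data file.
cat > MyFile <<'EOF'
03,aaa,xxxx,xxxx
04,xxxxxxxx
04,xxxxxxxx
03,bbb,xxx,xxx
04,xxxxxxxx
03,ccc,xxx,xxx
04,xxxxxxxx
EOF

# Same awk test as the script above, but printing the plain NR;
# paste -s joins the one-per-line numbers into a single line.
rows=$(awk -F"," '$2 != prev && $1=="03" && NR != 1 { print NR; prev = $2 }' MyFile | paste -s -d' ' -)
echo "csplit MyFile $rows"

# $rows is deliberately unquoted so each number becomes its own argument.
csplit -s MyFile $rows

# Sanity check: every piece should begin with an 03 record.
for f in xx*; do
    printf '%s: %s\n' "$f" "$(head -n 1 "$f")"
done
```

The loop at the end is a cheap way to spot a bad split: if any xxNN file does not start with an 03 record, the line numbers passed to csplit did not land where expected.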