Split a file based on pattern in awk, grep, sed or perl

Hi All,

Can someone please help me write a script for the following requirement in awk, grep, sed or perl.

Buuuu xxx bbb
Kmmmm rrr ssss uuuu
Kwwww zzzz ccc
Roooowwww eeee
Bxxxx jjjj dddd
Kuuuu eeeee nnnn
Rpppp cccc vvvv cccc
Rhhhhhhyyyy tttt
Lhhhh rrrrrssssss
Bffff mmmm iiiii
Ktttt eeeeeee
Kyyyyy iiiii wwww
Rwwww rrrr sssss eeee
Rnnnnn xxxxxxccccc

I would like to split the above file into 3 files, as shown below:

file1:

Buuuu xxx bbb
Kmmmm rrr ssss uuuu
Kwwww zzzz ccc
Roooowwww eeee

file2:

Bxxxx jjjj dddd
Kuuuu eeeee nnnn
Rpppp cccc vvvv cccc
Rhhhhhhyyyy tttt
Lhhhh rrrrrssssss

file3:

Bffff mmmm iiiii
Ktttt eeeeeee
Kyyyyy iiiii wwww
Rwwww rrrr sssss eeee
Rnnnnn xxxxxxccccc

Basically, each output file needs to start with a "B" record, and a new file should be started whenever another "B" record is encountered.

I appreciate your help.

Thanks

Kumar

With (g)awk

awk 'BEGIN{RS="\n?B"} (NR-1){print "B" $0 > ("output-file_" NR)}' input-file

Ripat, your solution doesn't give the desired output...

Try this:

awk '/^B/{close("file"f);f++}{print $0 > "file"f}' file

Regards
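
For anyone who wants the one-liner above spelled out, here is a minimal sketch of the same idea with comments. The awk_split_demo directory, the part_ prefix and the variable names out and n are my own choices for illustration, not part of the original answer. It also skips any lines that come before the first "B" record, which is an assumption about what you would want in that case.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, chosen for this demo only.
mkdir -p awk_split_demo
rm -f awk_split_demo/part_*

# Sample input copied from the question.
cat > awk_split_demo/input.txt <<'EOF'
Buuuu xxx bbb
Kmmmm rrr ssss uuuu
Kwwww zzzz ccc
Roooowwww eeee
Bxxxx jjjj dddd
Kuuuu eeeee nnnn
Rpppp cccc vvvv cccc
Rhhhhhhyyyy tttt
Lhhhh rrrrrssssss
Bffff mmmm iiiii
Ktttt eeeeeee
Kyyyyy iiiii wwww
Rwwww rrrr sssss eeee
Rnnnnn xxxxxxccccc
EOF

awk -v dir=awk_split_demo '
  /^B/ {                            # a "B" record starts a new piece
      if (out != "") close(out)     # close the previous piece first
      n++
      out = dir "/part_" n          # part_1, part_2, ...
  }
  out != "" { print > out }         # ignore lines before the first "B"
' awk_split_demo/input.txt
```

Closing each piece before opening the next keeps the number of open files at one, which matters when the input contains many "B" records. To check the result, concatenating the pieces in order and comparing with cmp against the input should report no difference.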

I guess it would be something like this in Perl:

perl -n -e '/^B/ and open FH, ">output_".$n++; print FH;' file

Hi.

It is useful to know how to do custom splitting with awk, perl, etc., but one can often use utilities that are already present, such as csplit:

#!/bin/bash -

# @(#) s1       Demonstrate context splitting, csplit.

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) csplit
echo

FILE=${1-data1}

# Remove debris from previous run.

rm -f xx*

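# POSIX csplit (as on Solaris) has no -z or '{*}': use a large repeat
# count, and -k to keep the files already written when the count runs out.
if uname -a | grep SunOS
then
  csplit -k "$FILE" '/^B/' '{99}'
else
  # GNU csplit: -z drops empty pieces, '{*}' repeats until end of input.
  csplit -z "$FILE" '/^B/' '{*}'
fi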

echo " Samples of output files:"
for file in xx*
do
  echo
  echo "-- $file --"
  head -2 $file
done

exit 0

Producing:

$ ./s1

(Versions displayed with local utility "version")
SunOS 5.10
GNU bash 3.00.16
csplit - no version provided for /usr/bin/csplit.

SunOS vm-solaris 5.10 Generic_120012-14 i86pc i386 i86pc
0
64
89
csplit: {99} - out of range
90
 Samples of output files:

-- xx00 --

-- xx01 --
Buuuu xxx bbb
Kmmmm rrr ssss uuuu

-- xx02 --
Bxxxx jjjj dddd
Kuuuu eeeee nnnn

-- xx03 --
Bffff mmmm iiiii
Ktttt eeeeeee

See man csplit for details (Solaris man page has some examples, unlike the man page in Linux) ... cheers, drl
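
If the default xx00, xx01, ... names are not what you want, the following sketch shows how the output names can be chosen. It assumes GNU csplit (coreutils); the -z and -b options and the '{*}' repeat are GNU extensions and will not work with the Solaris csplit shown above. The csplit_demo directory and the part_ prefix are hypothetical names picked for the demo.

```shell
#!/bin/sh
# Assumes GNU csplit; directory and prefix are made up for this demo.
mkdir -p csplit_demo
rm -f csplit_demo/part_*

# Sample input copied from the question.
cat > csplit_demo/input.txt <<'EOF'
Buuuu xxx bbb
Kmmmm rrr ssss uuuu
Kwwww zzzz ccc
Roooowwww eeee
Bxxxx jjjj dddd
Kuuuu eeeee nnnn
Rpppp cccc vvvv cccc
Rhhhhhhyyyy tttt
Lhhhh rrrrrssssss
Bffff mmmm iiiii
Ktttt eeeeeee
Kyyyyy iiiii wwww
Rwwww rrrr sssss eeee
Rnnnnn xxxxxxccccc
EOF

# -s  do not print the byte count of each piece
# -z  drop the empty piece that precedes the first "B" line
# -f  prefix for the output names (default is "xx")
# -b  printf-style format for the numeric suffix (default is "%02d")
csplit -s -z -f csplit_demo/part_ -b '%d' csplit_demo/input.txt '/^B/' '{*}'
```

With the sample data this should leave three pieces under csplit_demo, each beginning with a "B" record.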

All solutions really worked great. Now I have a choice. Thanks. Appreciate your help.

awk '/^B/{close("file"f);f++}{print $0 > "file"f}' input.txt
perl -n -e '/^B/ and open FH, ">output_".$n++; print FH;'  input.txt
csplit -k input.txt '/^B/' '{99}'