Splitting large file into multiple files in unix based on pattern

I need to write a shell script for the scenario below.
My input file has data in this format:

qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43  
qwerty0101CFG 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 
qwerty0101CFG 12342 01022005 07022009 datainalc hitalbert 43  

The fields in each record are tab-separated.
I want to read the input file and, based on the last three characters of the first field
(for qwerty0101TWE, that is TWE), put the record in the file TWE.txt,
then the next record (mXZ) in MXZ.txt.
Likewise, all TWE records go in one file and all mXZ records in another.
Kindly help me write a shell script for this, as I'm new to Unix.

perl -ne '/(.{3})\t/;open O,">>$1.txt";print O;close O' file

try this..

 % awk ' { print $1 } ' input_file  | cut -c 11- | uniq | awk ' { print "grep \"[0-9]"$0"\" input_file > "$0".txt" } ' | sh
awk '{f=substr($1,length($1)-2)".txt";print >> f;close(f)}' file

Thanks for the reply.

There are also a few variations; please help.

How do I skip the first and last lines of the file before splitting it?

Also, I want to keep the original file as it is.

awk -v ll=$(wc -l < file) 'NR>1 && NR<ll{f=substr($1,length($1)-2)".txt";print >> f;close(f)}' file > newfile

It's giving me an error and I'm not able to figure it out. Please help.

$ awk -v ll=$(wc -l < test) 'NR>1 && NR<ll{f=substr($1,length($1)-2)".txt";print > f;close(f)}' test > newfile
syntax error: `(' unexpected
$

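Try backticks instead; the error suggests your shell is an older Bourne sh that does not understand $( ):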

awk -v ll=`wc -l < file` 'NR>1 && NR<ll{f=substr($1,length($1)-2)".txt";print >> f;close(f)}' file > newfile

Try with sed:

# for((i=2;i<$(sed -n '$=' file);i++));do sed -n "$i p" file >> $(sed -n "$i s/^.\{10\}\(...\).*/\1/p" file).txt;done

Use nawk on Solaris.

Franklin's code needs just a little modification:

nawk -v ll=$(wc -l < test) 'NR>1 && NR<ll{f=substr($1,length($1)-2)".txt"; print > f}' test
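
A note on why dropping close(f) matters when the redirection is >: in awk, print > f truncates the file each time it is opened, so closing after every record and reopening with > leaves only the last record in each file. Below is a minimal sketch that runs the three variants side by side. The scratch directory, the file names and the use of mktemp are assumptions for illustration, and awk may need to be nawk on Solaris.

```shell
# Hypothetical demo in a scratch directory; all file names here are made up.
demo_dir=`mktemp -d` && cd "$demo_dir" || exit 1
printf 'aTWE\t1\nbTWE\t2\n' > demo_input

# ">" with close(): every reopen truncates, so only the last record survives.
awk '{f="trunc.txt"; print > f; close(f)}' demo_input

# ">>" with close(): every reopen appends, so all records are kept.
awk '{f="append.txt"; print >> f; close(f)}' demo_input

# ">" without close(): the file is opened once and stays open, all records kept.
awk '{f="open.txt"; print > f}' demo_input

# Compare the line counts of the three results.
wc -l trunc.txt append.txt open.txt
```

Here trunc.txt should end up with one line and the other two with two lines each. Keeping the files open is the simplest option, but an awk with a low open-file limit may fail when there are many distinct suffixes, and that is where >> with close(f) is the safer choice. With >>, remove old output files before re-running, or the new records are appended to the old ones.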


Hi,
None of the commands are working; all give the same error: `(' unexpected
Below is the requirement:

txtytg09dfgdfg
qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43 
qwerty0101CFG 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 
qwerty0101CFG 12342 01022005 07022009 datainalc hitalbert 43
byetekr 09

In the above file I have to skip the first and last lines, i.e. not process them.
Based on the last three characters of the first field of the second record
(qwerty0101TWE, i.e. TWE), I want to put this record in the file TWE.txt,
then the next record (mXZ) in MXZ.txt.
With the above commands, files are also created for the first and last lines,
like dgf.txt and kr0.txt. I don't want these files to be created.
I also need to keep the input file as it is.
Please help, it's urgent!


 
$ nawk '{f=substr($1,length($1)-2)".txt";print $2,$3,$4,$5,$6 >> f;close(f)}' test
$
$ cat TWE.txt
12345 01022005 01022005 datainala alanfernanded
12342 01022005 07022009 datainalc hitalbert
$ cat mXZ.txt
12349 01022005 06022008 datainalb johngalilo
12349 01022005 06022008 datainalb johngalilo
$ cat CFG.txt
12345 01022005 01022005 datainala alanfernanded
12342 01022005 07022009 datainalc hitalbert

Sample file

$ cat test
nala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43
qwerty0101CFG 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101CFG 12342 01022005 07022009 datainalc hitalbert 43
byetekr 09

Command:

nawk '{f=substr($1,length($1)-2)".txt";print $2,$3,$4,$5,$6 >> f;close(f)}' test

It's creating these files:

ala.txt -- created for the first line
ekr.txt -- created for the last line, plus the other files too
Actually, I don't want files created for the first and last lines.
Also, there are n records, not just 6; I don't want to hard-code anything.

$ cat TWE.txt
12342 01022005 07022009 datainalc hitalbert

I do want the first field (e.g. qwerty0101mXZ) in the output file.
This command works fine:

awk '{f=substr($1,length($1)-2)".txt";print >> f;close(f)}' test

but I want to modify the command to skip the first and last lines.

 
$ lineno=`wc -l < test`; nawk -v lineno="$lineno" '{ if (NR>1 && NR < lineno){f=substr($1,length($1)-2)".txt";print >> f;close(f)}}' test

$ cat TWE.txt
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43

$ cat mXZ.txt
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28

$ cat CFG.txt
qwerty0101CFG 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101CFG 12342 01022005 07022009 datainalc hitalbert 43

$ cat test
nala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43
qwerty0101CFG 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101CFG 12342 01022005 07022009 datainalc hitalbert 43
byetekr 09

Thanks for all your help and effort. Just one flaw, please help:

$ ls
CFG.txt mXZ.txt TWE.txt ekr.txt test

After running the ls command, I see that four files were created. I do not want a file to be created for the last line, i.e. ekr.txt is unwanted; it comes from the last line, which should not be processed.

Make sure you don't have any blank lines in your main file.

Run:

cat -n filename

If the file ends with an empty line, that empty line is counted as the last line, so the byetekr line is processed like any other record and the awk command creates ekr.txt.

It is not a flaw in the command; it is a problem with your input file.
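
If cleaning the input is not an option, one way to sidestep both the line count and a trailing blank line is to let awk hold each record back and write it only once a later non-blank record shows up. This is a minimal sketch, assuming a POSIX awk (nawk on Solaris). The function name split_by_suffix is made up for illustration and is not from the commands above.

```shell
# split_by_suffix FILE: hypothetical helper for illustration.
# Skips the first and last non-blank lines of FILE and appends every other
# record to <last three characters of field 1>.txt in the current directory.
split_by_suffix() {
    awk '
        !NF { next }                 # ignore blank lines, wherever they are
        {
            n++
            if (n > 2) {             # the held record is neither first nor last
                f = substr(key, length(key) - 2) ".txt"
                print prev >> f
                close(f)
            }
            prev = $0                # hold this record until another one arrives
            key = $1
        }
    ' "$1"
}
```

Call it as split_by_suffix test. The last record is held but never written, so no file is created for it even when blank lines follow. Because the output is appended with >>, remove old .txt files before running it again.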

Thanks a ton, I found my mistake.
All working fine.


I need one more favor.

I need to write a shell script for this that reads the input file (i.e. test) from a particular location and then runs the awk command on it.

Please help me with this.

It may sound too basic, but I'm a beginner in Unix.

which will read input file i.e test from particular location

Will the filename change every day?
What does the filename look like?

 
filename="/tmp/test"
lineno=`wc -l < "$filename"`
nawk -v lineno="$lineno" '{ if (NR>1 && NR < lineno){f=substr($1,length($1)-2)".txt";print >> f;close(f)}}' "$filename"

Yes, the filename will change, but it will always be placed in one location.

There is no specific format for the filename.

Then pass the filename as an argument to your script and read it using $1:

filename=$1
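
Putting the pieces together, the whole script might look something like the sketch below. The script name split.sh, the function name split_file and the readability check are assumptions added for illustration. It uses awk, which would be nawk on Solaris, and backticks so that an older sh accepts it.

```shell
#!/bin/sh
# split.sh (hypothetical name): split the file given as the first argument
# into <suffix>.txt files, skipping its first and last lines.
split_file() {
    filename=$1
    # Assumed safety check: stop early if the file cannot be read.
    [ -r "$filename" ] || { echo "cannot read: $filename" >&2; return 1; }
    lineno=`wc -l < "$filename"`
    awk -v lineno="$lineno" 'NR > 1 && NR < lineno {
        f = substr($1, length($1) - 2) ".txt"   # last three characters of field 1
        print >> f
        close(f)
    }' "$filename"
}

# Run only when a filename was supplied on the command line.
if [ $# -gt 0 ]; then
    split_file "$1"
fi
```

It would be run as sh split.sh /tmp/test. The output files are written to the directory the script is run from, and the input file itself is only read, never modified.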