I need to split the file based on the first column.
All records that share the same value in the first column should go to the same file.
So in the above data,
the first three records will go to file 1.
The problem is that the number of records with the same first-column value can vary.
For example, the sample data above shows three records with the same value,
but it could be 3, 4, 100, or any number. The same applies to the other sets of records.
Does the number in column #1 always come in groups, or can a value such as 123456 appear again further down in the file after other data?
If there are many distinct values, the output files should be closed as you go, so that you don't run out of open file handles.
EDIT:
This should close each file when field #1 changes.
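A minimal sketch of that idea, assuming the records are pipe-delimited and already grouped by the first field, as in the sample from this thread. The name `input_file` and the `<key>.txt` output names are only illustrative.

```shell
# Sample input, copied from the records shown in this thread.
cat > input_file <<'EOF'
123456|ASDF|WORD|MIND|456890|40050|RTS
123456|9UIL|WORD|BLINK|15G26|43215|GTS
123456|9UIL|WORD|BLINK|15G26|43215|BTS
125828|9UIH|WIRD|BLANK|15G26|45215|NTS
125828|9UIH|WIRD|BLANK|15G26|47215|PTS
145679|8UIH|BIRD|BLINK|15T26|90807|ZTS
EOF

# Write every record to <field1>.txt.
# When field 1 changes, close the previous output file first,
# so only one output file is open at any time.
awk -F'|' '
$1 != prev {
    if (prev != "") close(prev ".txt")
    prev = $1
}
{ print > ($1 ".txt") }
' input_file
```

This relies on the groups being contiguous: if a key showed up again later in the file, reopening its file with `>` would overwrite what was written earlier, so `>>` would be needed in that case.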
Thanks, it's working fine. The records will always come in groups.
But there is another issue: if we have around 77k distinct sets of records, it will create 77k files. I don't want to create that many files. I can combine the sets and want to end up with three or four files at most, but records from the same set must not be split across two files.
We can use only part of the first field to create larger groups. If you show an example of how you want to group them, we can show you how it can be done, e.g. by the first 2 digits.
Here is an example using the first 2 digits:
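One way this could look, as a sketch only: the output file is named after the first two characters of field 1, so every key starting with the same two digits lands in the same file. The names `input_file`, `12.txt`, and `14.txt` are illustrative and follow from the sample data in this thread.

```shell
# Sample input, copied from the records shown in this thread.
cat > input_file <<'EOF'
123456|ASDF|WORD|MIND|456890|40050|RTS
123456|9UIL|WORD|BLINK|15G26|43215|GTS
123456|9UIL|WORD|BLINK|15G26|43215|BTS
125828|9UIH|WIRD|BLANK|15G26|45215|NTS
125828|9UIH|WIRD|BLANK|15G26|47215|PTS
145679|8UIH|BIRD|BLINK|15T26|90807|ZTS
EOF

# Group by the first two characters of field 1.
# The files are left open on purpose: the same prefix can come back
# later in the input, and there are at most 100 two-digit prefixes.
awk -F'|' '{ print > (substr($1, 1, 2) ".txt") }' input_file
```

With the sample data this gives `12.txt` (five records) and `14.txt` (one record). Because a whole key always maps to one prefix, a set of records is never split across files, but the file sizes depend entirely on how the keys are distributed.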
I can combine the first set and the second set into one file. We can keep combining records into one file until it reaches 500000 records.
But we must take care of one thing: records from the same set must not be split across two files.
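A possible sketch for this requirement, assuming the records are grouped by field 1: each set is buffered in memory and appended to the current output file only if it still fits under the limit, otherwise a new file is started. The `part_N.txt` names, the `max` variable, and the `emit_group` function are hypothetical, and `max=4` is used here only so the small sample shows the behaviour; the real limit would be `max=500000`.

```shell
# Sample input, copied from the records shown in this thread.
cat > input_file <<'EOF'
123456|ASDF|WORD|MIND|456890|40050|RTS
123456|9UIL|WORD|BLINK|15G26|43215|GTS
123456|9UIL|WORD|BLINK|15G26|43215|BTS
125828|9UIH|WIRD|BLANK|15G26|45215|NTS
125828|9UIH|WIRD|BLANK|15G26|47215|PTS
145679|8UIH|BIRD|BLINK|15T26|90807|ZTS
EOF

awk -F'|' -v max=4 '
# Write the buffered set to the current part file, starting a new
# part first if the set would push the current one past max records.
function emit_group(    i) {
    if (n == 0) return
    if (written > 0 && written + n > max) {
        close(out)
        part++
        written = 0
    }
    out = "part_" part ".txt"
    for (i = 1; i <= n; i++) print buf[i] > out
    written += n
    n = 0
}
BEGIN { part = 1 }
$1 != prev { emit_group(); prev = $1 }   # key changed: flush the finished set
{ buf[++n] = $0 }                        # buffer the current record
END { emit_group() }                     # flush the last set
' input_file
```

With `max=4` the sample ends up as `part_1.txt` (the three 123456 records) and `part_2.txt` (the 125828 and 145679 records). A single set larger than `max` is still written whole to its own file, so that file can exceed the limit.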
The code below first collects the distinct values of the first column and saves them to a file. It then reads each value from that file and writes all matching records of the input file to a file named after that value.
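# Collect the distinct keys from column 1 (sort -u also copes with ungrouped input).
cut -d'|' -f1 input_file | sort -u > final
# For each key, copy the records whose first field is exactly that key.
while IFS= read -r line
do
    grep "^${line}|" input_file > "${line}.txt"
done < final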
Once the above code is executed, it produces 3 files (for the example in the question):
123456.txt
125828.txt
145679.txt
The contents of the files are as below.
more 123456.txt
123456|ASDF|WORD|MIND|456890|40050|RTS
123456|9UIL|WORD|BLINK|15G26|43215|GTS
123456|9UIL|WORD|BLINK|15G26|43215|BTS
more 125828.txt
125828|9UIH|WIRD|BLANK|15G26|45215|NTS
125828|9UIH|WIRD|BLANK|15G26|47215|PTS
more 145679.txt
145679|8UIH|BIRD|BLINK|15T26|90807|ZTS