Reading and copying selected rows

Dear All,

I have a data file, input.res, like the one below. (Only six columns are shown here as an example.)
The first column runs in sequence from 1 to 148.

Input file

1 Q0 9_August_2014_Entertainment2 0 20.14967806339729 BM25b1.0
1 Q0 13_October_2012_Page323 1 20.134224346765738 BM25b1.0
1 Q0 3_April_2014_Entertainment4 2 19.99980848390178 BM25b1.0
1 Q0 13_December_2012_Page324 3 19.49552258200046 BM25b1.0
1 Q0 18_December_2012_Page331 4 19.379322689633636 BM25b1.0
1 Q0 9_February_2013_International1 5 19.37324470193972 BM25b1.0
2 Q0 31_October_2012_MuslimWorld9 0 16.50230172618038 BM25b1.0
2 Q0 31_October_2012_ 0 1 16.244196279676054 BM25b1.0
2 Q0 16_November_2012_ 12 2 16.204816883686515 BM25b1.0
2 Q0 31_October_2012_ 1 3 15.947697590184493 BM25b1.0
2 Q0 31_October_2012_MuslimWorld8 4 15.811735282661466 BM25b1.0
2 Q0 9_November_2012_LocalNews5 5 14.906807130618539 BM25b1.0
...........
...........
...........
148 Q0 3_January_2012_Page39 0 13.227415420268592 BM25b1.0
148 Q0 6_March_2012_Front1 1 13.023359161460377 BM25b1.0
148 Q0 3_June_2014_Front13 2 12.969833413025505 BM25b1.0
148 Q0 27_March_2012_Page36 3 12.687908813980718 BM25b1.0
148 Q0 11_March_2013_Front10 4 12.668886361211987 BM25b1.0
148 Q0 29_June_2014_Page30 5 12.607198882770502 BM25b1.0

I want to select the first four lines for each number and copy them to separate new files.

file1

1 Q0 9_August_2014_Entertainment2 0 20.14967806339729 BM25b1.0
1 Q0 13_October_2012_Page323 1 20.134224346765738 BM25b1.0
1 Q0 3_April_2014_Entertainment4 2 19.99980848390178 BM25b1.0
1 Q0 13_December_2012_Page324 3 19.49552258200046 BM25b1.0

file2

2 Q0 31_October_2012_MuslimWorld9 0 16.50230172618038 BM25b1.0
2 Q0 31_October_2012_ 0 1 16.244196279676054 BM25b1.0
2 Q0 16_November_2012_ 12 2 16.204816883686515 BM25b1.0
2 Q0 31_October_2012_ 1 3 15.947697590184493 BM25b1.0

........
........

file148

148 Q0 3_January_2012_Page39 0 13.227415420268592 BM25b1.0
148 Q0 6_March_2012_Front1 1 13.023359161460377 BM25b1.0
148 Q0 3_June_2014_Front13 2 12.969833413025505 BM25b1.0
148 Q0 27_March_2012_Page36 3 12.687908813980718 BM25b1.0

Thanks in advance.

Try something like this:

awk '{f="file" $1} p!=f{if(p)close(p); p=f} ++C[$1]<=4{print >f}' file

The input file needs to be grouped on field 1 for this to work. If that is not the case, it needs to be sorted first.
Closing each file once its group is finished is necessary; otherwise awk may end up with too many files open at once.
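
If the input is not already grouped, a stable sort on the first field in front of the same awk approach should do it. This is only a sketch: it assumes your sort supports -s (GNU and BSD sort do), which keeps the original order of lines within each number, and the sample input.res below is a shortened, shuffled stand-in made up from the data in the question.

```shell
# Hypothetical sample: a few lines from the question, deliberately not grouped.
cat > input.res <<'EOF'
2 Q0 31_October_2012_MuslimWorld9 0 16.50230172618038 BM25b1.0
1 Q0 9_August_2014_Entertainment2 0 20.14967806339729 BM25b1.0
1 Q0 13_October_2012_Page323 1 20.134224346765738 BM25b1.0
2 Q0 31_October_2012_ 0 1 16.244196279676054 BM25b1.0
1 Q0 3_April_2014_Entertainment4 2 19.99980848390178 BM25b1.0
1 Q0 13_December_2012_Page324 3 19.49552258200046 BM25b1.0
1 Q0 18_December_2012_Page331 4 19.379322689633636 BM25b1.0
EOF

# -s: stable sort, -n: numeric, -k1,1: compare on field 1 only.
# awk then writes the first four lines of each group to file<N>
# and closes the previous output file when the group changes.
sort -s -n -k1,1 input.res |
awk '{f="file" $1} p!=f{if(p)close(p); p=f} ++C[$1]<=4{print >f}'
```

With this sample, file1 should receive the four highest-ranked lines for number 1 and file2 the two lines for number 2.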


Hello mranrasheedamu,

Could you please try the following and let me know if it helps?

awk '{if(++Q[$1]<5){print $0 >> ("file" $1)} else {close("file" $1)}}' Input_file

In my test this created two files, file1 and file2, because I had only put lines for numbers 1 and 2 into Input_file.
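
Two things to keep in mind with the append approach: >> adds to whatever is already in the output files, so a second run doubles their contents, and the else branch only closes a file once a fifth line for that number shows up. A possible variant, sketched under the assumption that the extra open/close per line is acceptable for a file of this size, closes the file right after each write and clears old outputs first. The sample Input_file and the rm line are just illustrations for this test data.

```shell
# Hypothetical sample, not grouped on field 1.
cat > Input_file <<'EOF'
2 Q0 31_October_2012_MuslimWorld9 0 16.50230172618038 BM25b1.0
1 Q0 9_August_2014_Entertainment2 0 20.14967806339729 BM25b1.0
1 Q0 13_October_2012_Page323 1 20.134224346765738 BM25b1.0
2 Q0 31_October_2012_ 0 1 16.244196279676054 BM25b1.0
1 Q0 3_April_2014_Entertainment4 2 19.99980848390178 BM25b1.0
1 Q0 13_December_2012_Page324 3 19.49552258200046 BM25b1.0
1 Q0 18_December_2012_Page331 4 19.379322689633636 BM25b1.0
EOF

# Remove outputs left over from an earlier run, since >> would append to them.
rm -f file1 file2

# For the first four lines of each number: append to file<N>, then close it
# straight away so only one output file is open at any time.
awk '++Q[$1] <= 4 { out = "file" $1; print >> out; close(out) }' Input_file
```

Because every write reopens the file in append mode, this does not need the input to be sorted or grouped.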

Thanks,
R. Singh
