It should consider each element in the number column and bin all the lines as shown below, using an interval of 100. The third column should show the bin range.
The first element in the number column is 75, so it should bin 75-175; the second element is 160, so it should bin the elements that fall within 160-260; the third element is 88, so 88-188; the fourth element is 114, so 114-214; and so on for each element in the number column.
awk '
FNR==1{
next
}
NR==FNR{
B[$2]=$2+100
next
}
{
for(i in B) {
r=i "-" B[i]
if ( i+0<=$2+0 && $2+0 < B[i]+0 ) print $0, r > ( "bin_" r )
}
}
' OFS='\t' file file
awk '
FNR==1{ # skip the header record (on both passes over the file)
next
}
NR==FNR{ # when reading the file for the first time ( that is when NR equals FNR )
B[$2]=$2+100 # represent the bins as an array, with index $2 and value $2 + 100
next # do not process the rest which is meant for the second time the file is read
}
{ # process the file for the second time
for(i in B) { # for each index in the bins
r=i "-" B[i] # compose the string that represents the bin's range
if ( i+0<=$2+0 && $2+0 < B[i]+0 ) print $0, r > ( "bin_" r ) # if $2 is within the bin's range, print the record and the range to the corresponding file
}
}
' OFS='\t' file file # use a tab to separate the record from the range. Read the file twice: once for the bins, once for the output.
--
Note: if there are too many bin files, close() statements will need to be added to close files along the way; otherwise there will be "too many open files" errors.
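
As a rough sketch of the close() approach mentioned in the note, and not a tested drop-in replacement: the variant below closes each bin file right after writing to it. Because a closed file is reopened on the next write, it has to append with >> instead of > (a plain > would truncate the file on every reopen). The function name bin_with_close and the variable out are made up for this illustration, and the sketch assumes the input has one header line and a numeric second column.

```shell
# Hypothetical helper: $1 is the input file (header line, numeric 2nd column).
# Writes one bin_<low>-<high> file per distinct value of column 2.
bin_with_close() {
    awk '
    FNR==1 { next }                     # skip the header on both passes
    NR==FNR {                           # first pass: collect the bins
        B[$2]=$2+100
        next
    }
    {                                   # second pass: distribute the records
        for (i in B) {
            if ( i+0<=$2+0 && $2+0 < B[i]+0 ) {
                r=i "-" B[i]
                out="bin_" r
                print $0, r >> out      # append, since the file is reopened on each write
                close(out)              # release the file descriptor immediately
            }
        }
    }
    ' OFS='\t' "$1" "$1"
}
```

Since >> appends, leftover bin_* files from an earlier run should be removed first, for example with rm -f bin_* before calling bin_with_close file. Opening and closing a file for every line is slower than keeping it open, so this only pays off when the number of bins really exceeds the open-file limit.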