awk comparison

Hello all,

This is probably a very simple question, but I am stuck on a small piece of code:

I am trying to do a comparison to get the maximum value of column 6 if columns 1, 4 and 5 of two or more rows match. Here is what I am doing:

awk -F'\t' '{if ($6 > a[$1"\t"$4"\t"$5])a[$1"\t"$4"\t"$5]=$6}END{for (i in a) print i"\t"a[i]}'

But in the output I also want to include the column 2 and 3 values from the row where column 6 was the highest.

Important: All fields are tab separated.
For example,

111 aaa bbb ccc ddd 0.9
111 XYZ PQR ccc ddd 0.5
111 xyz pqr ccc ddd 0.7

will give

111 aaa bbb ccc ddd 0.9 

Please suggest a fix to my solution or any solution that works.

Save the whole line, or the fields you want, when you save the maximum.

awk -F'\t' '
{
   if( $6 > a[$1"\t"$4"\t"$5] )
   {
      a[$1"\t"$4"\t"$5] = $6  
      output[$1"\t"$4"\t"$5] = $0;   # saves the whole line 
  }
}
END{
   for (i in a) 
       print output[i];
}'
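
If you only want some of the fields rather than the whole line, a rough sketch along the same lines could look like this. The function name pick_fields and the array names best, c2 and c3 are made up for illustration, and it assumes column 6 is always a positive number:

```shell
# pick_fields is a hypothetical helper name, not from the thread.
pick_fields() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        key = $1 FS $4 FS $5
        # Assumes column 6 is always positive (a missing entry counts as 0).
        if ($6 + 0 > best[key] + 0) {
            best[key] = $6    # maximum seen so far for this key
            c2[key] = $2      # column 2 of the row holding the maximum
            c3[key] = $3      # column 3 of the row holding the maximum
        }
    }
    END {
        for (key in best) {
            split(key, k, FS)    # k[1] = col 1, k[2] = col 4, k[3] = col 5
            print k[1], c2[key], c3[key], k[2], k[3], best[key]
        }
    }' "$@"
}

# Sample rows from the question, tab separated.
printf '111\taaa\tbbb\tccc\tddd\t0.9\n111\tXYZ\tPQR\tccc\tddd\t0.5\n111\txyz\tpqr\tccc\tddd\t0.7\n' | pick_fields
```

The END block rebuilds the row in the original column order, so the output has the same layout as the input.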
# cat file
111     aaa     bbb     ccc     ddd     0.9
111     XYZ     PQR     ccc     ddd     0.5
111     xyz     pqr     ccc     ddd     0.7
# awk 'BEGIN{FS=OFS="\t"}{x=$1FS$4FS$5;if(!y[x]||y[x]<$6){y[x]=$6;z[x]=$0}}END{for(i in z){print z[i]}}' file
111     aaa     bbb     ccc     ddd     0.9

agama's code has some problems, but the idea is right:

awk -F'\t' '{if ($6 > a[$1"\t"$4"\t"$5]){a[$1"\t"$4"\t"$5]=$6;b[$1"\t"$4"\t"$5]=$0 }}END{for (i in a) print b[i]}' "$FILE"

Serves me right for not running it. Thanks.
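
One caveat: the comparisons in this thread treat a key that has not been seen yet as zero, so a group whose column 6 values are all zero or negative may not come out right. A possible variant that checks for the key explicitly with the "in" operator is sketched below. The function name best_rows and the array names max and line are just for illustration, and it assumes column 6 always holds a number:

```shell
# best_rows is a hypothetical helper name, not from the thread.
best_rows() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        key = $1 FS $4 FS $5
        # Take the row if the key is new, or if column 6 beats the stored maximum.
        if (!(key in max) || $6 + 0 > max[key]) {
            max[key] = $6 + 0    # force a numeric comparison next time
            line[key] = $0       # remember the whole winning row
        }
    }
    END {
        for (key in max)
            print line[key]
    }' "$@"
}

# Sample rows from the question, tab separated.
printf '111\taaa\tbbb\tccc\tddd\t0.9\n111\tXYZ\tPQR\tccc\tddd\t0.5\n111\txyz\tpqr\tccc\tddd\t0.7\n' | best_rows
```

Note that "for (key in max)" visits keys in no particular order, so pipe the result through sort if the order of the groups matters.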