Hello all,
This is probably a very simple question, but I am stuck on a small part of my code.
I am trying to do a comparison to get the maximum value of column 6 if columns 1, 4 and 5 of two or more rows match. Here is what I am doing:
awk -F'\t' '{if ($6 > a[$1"\t"$4"\t"$5])a[$1"\t"$4"\t"$5]=$6}END{for (i in a) print i"\t"a[i]}'
But in the output I also want to include the column 2 and 3 values from the row where the column 6 value was the highest.
Important: All fields are tab separated.
For example,
111 aaa bbb ccc ddd 0.9
111 XYZ PQR ccc ddd 0.5
111 xyz pqr ccc ddd 0.7
will give
111 aaa bbb ccc ddd 0.9
Please suggest a fix to my solution or any solution that works.
agama
Save the whole line, or the fields you want, when you save the maximum.
awk -F'\t' '
{
    if( $6 > a[$1"\t"$4"\t"$5] )
    {
        a[$1"\t"$4"\t"$5] = $6
        output[$1"\t"$4"\t"$5] = $0; # saves the whole line
    }
}
END{
    for (i in a)
        print output[i];
}'
# cat file
111 aaa bbb ccc ddd 0.9
111 XYZ PQR ccc ddd 0.5
111 xyz pqr ccc ddd 0.7
# awk 'BEGIN{FS=OFS="\t"}{x=$1FS$4FS$5;if(!y[x]||y[x]<$6){y[x]=$6;z[x]=$0}}END{for(i in z){print z[i]}}' file
111 aaa bbb ccc ddd 0.9
agama's code has some problems, but the idea is right:
awk -F'\t' '{if ($6 > a[$1"\t"$4"\t"$5]){a[$1"\t"$4"\t"$5]=$6;b[$1"\t"$4"\t"$5]=$0 }}END{for (i in a) print b[i]}' $FILE
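One caveat with the one-liners in this thread: comparing $6 against an array element that has not been set yet treats that element as 0, so a group whose column 6 values are all zero or negative may never be stored. Below is a minimal sketch of one way around that, assuming column 6 is always numeric. The function name max_by_key and the array names max and line are made up for illustration; they are not from the posts above.

```shell
# Hypothetical helper: print the full line holding the largest column-6 value
# for each (column 1, column 4, column 5) group of tab-separated input.
max_by_key() {
    awk 'BEGIN { FS = "\t" }
    {
        key = $1 FS $4 FS $5
        # "key in max" is true only once the key has been stored, so the first
        # row of a group is always kept, even if its value is 0 or negative.
        # Adding 0 forces a numeric comparison.
        if (!(key in max) || $6 + 0 > max[key] + 0) {
            max[key]  = $6
            line[key] = $0
        }
    }
    END { for (key in line) print line[key] }' "$@"
}

# Sample input from the question, written with real tabs.
printf '111\taaa\tbbb\tccc\tddd\t0.9\n111\tXYZ\tPQR\tccc\tddd\t0.5\n111\txyz\tpqr\tccc\tddd\t0.7\n' | max_by_key
```

With the sample data this prints the "111 aaa bbb ccc ddd 0.9" line. When the input contains several groups, the order in which "for (key in line)" visits them is not defined by awk, so pipe the result through sort if the order matters.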
agama
Serves me right for not running it. Thanks.