Help with simple bash script involving a loop

Dear unix wizards,

I'd be very grateful for your help with the following.

I have a hypothetical file (file.txt) with three columns:

111 4 0.01 
112 3 0.02
113 2 0.03
114 1 0.04
115 1 0.06
116 2 0.02
117 3 0.01
118 4 0.05

Column 2 consists of pairs of integers from 1 to 4 (each number occurs exactly twice). I want to:

  • find the two lines with matching values for column 2
  • then of the two, pick out the line that has the greatest value for column 3
  • then finally print column 1 of that line.

I want to use a loop in a bash script to do this because, in reality, the values in column 2 go from 1 to about 10,000.

I have tried:

#!/bin/bash
for i in {1..4}
do
cat file.txt | awk '{print $2}' | grep -w "$i" | sort -k 3 | head -2 | tail -1 | awk '{print $1}'
done

In the hope that it would give me an output that looks like:

115
113
112
118

However, I'm getting nothing at all, and have clearly gravely misunderstood something here.

Please help!

Many thanks.

---------- Post updated at 12:53 PM ---------- Previous update was at 12:43 PM ----------

I'd omitted "$" before "i".
Silly mistake - sorry.

It sort of works now, but because I used awk '{print $2}' to search in that column, the final value that is printed is not from the original column 1 in the file, but from column 2 (as that is the only remaining column).

Is there a way around this?
And also, a more elegant way to script this?

Thanks.
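
One possible way around this, as a minimal sketch: let awk do the filtering on column 2 but print the whole line, so that columns 1 and 3 are still there for the later steps. The temporary file and the `result` variable are assumptions made here to keep the example self-contained, and `seq` is used instead of {1..4} so the sketch also runs in shells other than bash.

```shell
#!/bin/sh
# Sample data from the post, written to a temporary file (an assumption
# made only so that this sketch is self-contained).
file=$(mktemp)
cat > "$file" <<'EOF'
111 4 0.01
112 3 0.02
113 2 0.03
114 1 0.04
115 1 0.06
116 2 0.02
117 3 0.01
118 4 0.05
EOF

result=$(
  for i in $(seq 1 4)
  do
    # Keep the whole line when column 2 equals i, sort those lines
    # numerically on column 3, take the last (largest) one and
    # print its column 1.
    awk -v i="$i" '$2 == i' "$file" |
      LC_ALL=C sort -k3,3n | tail -n 1 | awk '{print $1}'
  done
)
echo "$result"
rm -f "$file"
```

With the sample data this prints 115, 113, 112, 118. It reads the file once for every value of i, so with about 10,000 values a single awk pass over the file is likely to be much faster.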

Hello aberg,

If you are not worried about the order in which the first field is printed, then the following may help you.
Please let me know how it goes.

awk 'FNR==NR{A[$2]=A[$2]>$3?A[$2]:$3;next} ($2 in A){if($3==A[$2]){print $1;delete A[$2]}}'  Input_file  Input_file

Thanks,
R. Singh

sort -n -k2 -k3 infile | awk '!a[$2]++ && l {print l} {l=$1}; END {print l}'

awk '{if(p[$2]<$3){p[$2]=$3;c[$2]=$1;}}END{for(i in c){print c[i]}}' example.data

Output:

113
112
118
115
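
The order shown above is not the requested one because awk does not specify the order in which `for (i in c)` visits the keys. A minimal sketch, assuming the output should follow column 2 in ascending order: print the key next to the value, sort numerically on the key, then drop the key again. The temporary file and the `ordered` variable are assumptions made to keep the example self-contained.

```shell
#!/bin/sh
# Sample data from the post, written to a temporary file (an assumption
# made only so that this sketch is self-contained).
file=$(mktemp)
cat > "$file" <<'EOF'
111 4 0.01
112 3 0.02
113 2 0.03
114 1 0.04
115 1 0.06
116 2 0.02
117 3 0.01
118 4 0.05
EOF

ordered=$(
  # Same comparison as in the one-liner above, but END prints
  # "key value" pairs so that sort can put them in key order.
  awk '{ if (p[$2] < $3) { p[$2] = $3; c[$2] = $1 } }
       END { for (i in c) print i, c[i] }' "$file" |
    LC_ALL=C sort -k1,1n | awk '{print $2}'
)
echo "$ordered"
rm -f "$file"
```

With the sample data this prints 115, 113, 112, 118, whatever order the awk implementation uses internally.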

Thanks RavinderSingh13 and rdrtx1. Much appreciated.

This one prints as soon as the second line of a pair is read in the main loop, i.e. it does not need an explicit loop in the END section:

awk '($2 in A3) {print ($3>A3[$2] ? $1 : A1[$2]); next} {A1[$2]=$1; A3[$2]=$3}' file.txt

Also, I have the habit of checking for existence first ( $2 in A3 ) so I don't need to think about uninitialized fields, negative numbers, etc.
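
To illustrate the point about negative numbers, here is a small sketch with made-up data (not from the original file) in which both column 3 values of pair 1 are negative. A test such as p[$2]<$3 compares against an uninitialized array element, which awk treats as 0, so that pair is never recorded; the ( $2 in A3 ) test does not have this problem. The variable names `naive` and `checked` are hypothetical.

```shell
#!/bin/sh
# Made-up data: pair 1 has only negative values in column 3.
file=$(mktemp)
cat > "$file" <<'EOF'
201 1 -0.5
202 1 -0.2
203 2 0.3
204 2 0.1
EOF

# Comparison against a possibly uninitialized element: 0 < -0.5 and
# 0 < -0.2 are both false, so nothing is ever stored for pair 1.
naive=$(
  awk '{ if (p[$2] < $3) { p[$2] = $3; c[$2] = $1 } }
       END { for (i in c) print c[i] }' "$file"
)

# Existence check first: the first line of each pair is stored
# unconditionally, the second one is compared against it.
checked=$(
  awk '($2 in A3) { print ($3 > A3[$2] ? $1 : A1[$2]); next }
       { A1[$2] = $1; A3[$2] = $3 }' "$file"
)

echo "comparison only: $naive"
echo "existence check: $checked"
rm -f "$file"
```

Here the comparison-only version prints just 203, while the version with the existence check prints 202 and 203.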