Only print the entries with the highest number?

Just want to say this is a great resource for all things Unix!!

cat tmp.txt
A 3
C 19
A 2
B 5
A 1
A 0
C 13
B 9
C 1

Desired output:

A 3
B 9
C 19

The following works, but I am wondering if there is a better way to do it.
I am looking for better performance on larger and more complex files.

awk '{print $1}' tmp.txt | sort | uniq > grep.lst
sort -k1,1 -k2,2nr tmp.txt > tmp.sort.txt
for i in $(cat grep.lst)
do
 grep "^$i " tmp.sort.txt | head -1 >> good.out
done
cat good.out

A 3
B 9
C 19

Thanks in advance.

David

awk '$2>M[$1]{M[$1]=$2}END{for (i in M) print i,M[i]}' file

Thanks!! That works great.

awk '$2>M[$1]{M[$1]=$2}END{for (i in M) print i,M[i]}' file

I will go RTFM on awk :) but would love to be taught how to fish instead of just being given a fish :)

Does this put the 2nd column's data into a hash array, using column 1 as the index?

{M[$1]=$2}

This part I think I understand:
For each element, print the index (which is the data from column 1), then the value stored in the hash array:

for (i in M) print i,M[i]

What I am missing is how awk ends up with the highest value in the array.

Thanks
David

sort -k1,1 -k2,2nr infile | awk '!A[$1]++'
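In case it helps to see why that one-liner works, here is the same idea spelled out on the sample data from the question. This is only a sketch: the scratch file and the array name `seen` are my own choices for illustration, not anything from the posts above.

```shell
# Sample data from the question, written to a scratch file.
tmp=$(mktemp)
printf '%s\n' 'A 3' 'C 19' 'A 2' 'B 5' 'A 1' 'A 0' 'C 13' 'B 9' 'C 1' > "$tmp"

# Sort by key, then by value in descending numeric order,
# so the largest value for each key is the first line of its group.
first_per_key=$(sort -k1,1 -k2,2nr "$tmp" | awk '
    # seen[$1]++ yields the old count: 0 (false) the first time a key shows up.
    # The negation is therefore true only once per key, and with no action
    # given, awk prints that line by default.
    !seen[$1]++
')

printf '%s\n' "$first_per_key"
rm -f "$tmp"
```

The sort does the heavy lifting here, so on very large files the single-pass awk approach may be cheaper, since it only keeps one value per key in memory.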

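$2>M[$1] - This part tests whether the current value in column two is bigger than the maximum saved so far for that key. If it is, the action {M[$1]=$2} that you described replaces the saved value, so by the end of the file each array element holds the largest value seen for its key.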
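Here is the test-then-assign pattern written out long-hand on the sample data, in case the compact form is hard to read. Treat it as a sketch: the array name `max` and the scratch file are just illustrative. It also adds an explicit "first time we see this key" check, on the assumption that the data could contain zero or negative values. As far as I can tell, the short version compares against an unset element as if it were 0, so a key with only such values would come out with an empty maximum.

```shell
# Sample data from the question, written to a scratch file.
tmp=$(mktemp)
printf '%s\n' 'A 3' 'C 19' 'A 2' 'B 5' 'A 1' 'A 0' 'C 13' 'B 9' 'C 1' > "$tmp"

result=$(awk '
    # Store the value if the key is new, or if it beats the saved maximum.
    # Adding 0 forces a numeric comparison.
    !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }

    # After the last line, print every key with its maximum.
    # The order of "for (k in max)" is unspecified, hence the sort below.
    END { for (k in max) print k, max[k] }
' "$tmp" | sort)

printf '%s\n' "$result"
rm -f "$tmp"
```

This reads the file once and keeps only one number per key, which is why it should scale better than the grep loop in the original question.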