Match one column of file1 with that of file2

Hi,

I have file1 like this

aaa  
ggg
ddd
vvv
eee

and file2

aaa  2
aaa  443
xxx 76
aaa 34
ggg 33
wee 99
ggg 33
ddd 1
ddd 10
ddd 98
sds 23
vvv 32
eee 11
eee 87

I want to match the column of file1 against the first column of file2 and, for each match, print the line of file2 with the highest value in column 2.
Desired output:

aaa 443
ggg 33
ddd 98
vvv 32
eee 87

thanks in advance.

Try this:

awk 'NR==FNR{a[$1]=1;next}($1 in a) && $2 > a[$1]{a[$1]=$2}END{for(i in a)print i, a[i]}' file1 file2
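
One caveat: awk visits keys in `for (i in a)` in an unspecified order, so the lines may not come out in file1 order. Here is a minimal sketch that keeps file1 order, assuming whitespace-separated fields and numeric values in column 2. The array names `want`, `order`, `max` and the temporary directory are only illustrative, not part of the original answer.

```shell
# Recreate the sample input from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' aaa ggg ddd vvv eee > "$tmp/file1"
printf '%s\n' 'aaa  2' 'aaa  443' 'xxx 76' 'aaa 34' 'ggg 33' 'wee 99' 'ggg 33' \
  'ddd 1' 'ddd 10' 'ddd 98' 'sds 23' 'vvv 32' 'eee 11' 'eee 87' > "$tmp/file2"

out=$(awk '
  # First file: remember each key and the order in which it appeared.
  NR == FNR { if (!($1 in want)) { want[$1]; order[++n] = $1 }; next }
  # Second file: keep the largest column-2 value seen for each wanted key.
  ($1 in want) && (!($1 in max) || $2 + 0 > max[$1]) { max[$1] = $2 + 0 }
  # Print in file1 order, skipping keys that never occurred in file2.
  END { for (i = 1; i <= n; i++) if (order[i] in max) print order[i], max[order[i]] }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$out"
```

With the sample data this prints the five lines of the desired output in the same order as file1.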

Hi Franklin52, thanks a lot for the reply. It's working just fine.

However, I also have another set of files that are similar to the ones above but laid out slightly differently, and I couldn't apply your code to them. Here are the files.

file1

aaa  
ggg
ddd
vvv
eee

and file2

ACC 2 2 21 aaa 
AC 443 3 22 aaa  
GCT 76 1 33 xxx 
TCG 34 2 33 aaa 
ACGT 33 1 22  ggg 
TTC 99 3 44 wee 
CCA 33 2 33 ggg 
AAC 1 3 55 ddd 
TTG 10 1 22 ddd 
TTGC 98 3 22 ddd 
GCT 23 1 21 sds 
ACGT 32 2 33 vvv 
CGT 11 2 33 eee 
CCC 87 2 44 eee 

As you can see, file2 now has 5 columns. I want to match the column of file1 against the fifth column of file2 and print the complete line of file2 with the highest value in column 2. If the values tie, so that there is no single highest value, just print the first matching line.
Desired output:

AC 443 3 22 aaa  
ACGT 33 1 22  ggg
TTGC 98 3 22 ddd 
ACGT 32 2 33 vvv 
CCC 87 2 44 eee

thanks in advance.

thanks again

awk 'NR==FNR{a[$1]=1;next} a[$5]{if($2 > a[$5]){a[$5]=$2;b[$5]=$0}} END{for(i in b)print b[i]}' file1 file2
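
Two things to be aware of with the one-liner above: `for (i in b)` returns the keys in an unspecified order, and because `a[...]` starts at 1, a key whose column-2 values are all 1 or less would never get a line stored in `b`. A rough sketch that avoids both, assuming whitespace-separated fields and numeric values in column 2 (the array names `want`, `order`, `max`, `line` are illustrative, and trailing spaces are trimmed from the sample data here):

```shell
# Recreate the five-column sample input in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' aaa ggg ddd vvv eee > "$tmp/file1"
printf '%s\n' 'ACC 2 2 21 aaa' 'AC 443 3 22 aaa' 'GCT 76 1 33 xxx' 'TCG 34 2 33 aaa' \
  'ACGT 33 1 22  ggg' 'TTC 99 3 44 wee' 'CCA 33 2 33 ggg' 'AAC 1 3 55 ddd' \
  'TTG 10 1 22 ddd' 'TTGC 98 3 22 ddd' 'GCT 23 1 21 sds' 'ACGT 32 2 33 vvv' \
  'CGT 11 2 33 eee' 'CCC 87 2 44 eee' > "$tmp/file2"

out=$(awk '
  # First file: remember each key and the order in which it appeared.
  NR == FNR { if (!($1 in want)) { want[$1]; order[++n] = $1 }; next }
  # Second file: store the first line for a key, then replace it only on a
  # strictly larger column-2 value, so the first line wins when values tie.
  ($5 in want) && (!($5 in max) || $2 + 0 > max[$5]) { max[$5] = $2 + 0; line[$5] = $0 }
  # Print the stored lines in file1 order.
  END { for (i = 1; i <= n; i++) if (order[i] in line) print line[order[i]] }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$out"
```

For `ggg`, both lines have 33 in column 2, so the first one (`ACGT 33 1 22  ggg`) is kept, as requested.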

thanks a lot