Vlookup multiple file and awk

Hello Folks,
What I wish to do is:
If first column matches in main and new file, then
paste $COL2 into output file. Something like vlookup . Please see also bold text in expected output.

mainfile

11
22
33
44
55
66
77
88
99
100
101
102

newfile1

11 MONDAY24
22 WEDNESDAY42
33 THURSDAY52
44 FRIDAY62
55 TUESDAY72
66 FRIDAY99
51 WHAT01
77 SATURDAY22
88 SUNDAY22

newfile2

11 MONDAY
22 WEDNESDAY
33 THURSDAY
44 FRIDAY
55 TUESDAY
66 FRIDAY
51 WHAT
77 SATURDAY
88 SUNDAY
101 MONTAG
109 DIENSTAG
115 MITTWOCH
55 DONNERSTAG

Expected output:

11 MONDAY24 MONDAY
22 WEDNESDAY42 WEDNESDAY
33 THURSDAY52 THURSDAY
44 FRIDAY62 FRIDAY
55 TUESDAY72 TUESDAY DONNERSTAG
66 FRIDAY99 FRIDAY
77 SATURDAY22 SATURDAY
88 SUNDAY22 SUNDAY
99
100
101 MONTAG
102

I run awk code with two steps:

awk 'NR==FNR {m[$1]=$1"\t"$2; next} {$1=m[$1]; print}' mainfile newfile1 > output
awk 'NR==FNR {m[$1]=$1"\t"$2; next} {$1=m[$1]; print}' mainfile newfile2 >> output

This one does not works as expected.
Could you please let me know how to solve this?
Awk is not a must. Easily understandable solutions are welcome.

Thanks
Boris

awk -f baris.awk mainfile newfile1 newfile2 where baris.awk is:

FNR==NR { a[$1];next}
$1 in a {a[$1]=a[$1] OFS $2}
END {
  for (i in a)
    print i,a
}
1 Like

Thanks Vgersh99,
Gives expected output.

Kind regards
Boris

Dear Vgersh99,
One note:
I do not understand why output is not in comply with the list of order in mainfile when I change newfile1 and newfile2 as follows:

mainfile

AA_11
BB_22
CC_33
DD_44
EE_55
FF_66
GG_77
HH_88
II_99
JJ_100
KK_101
LL_102

newfile1

AA_11 MONDAY24
BB_22 WEDNESDAY42
CC_33 THURSDAY52
DD_44 FRIDAY62
EE_55 TUESDAY72
FF_66 FRIDAY99
GG_51 WHAT01
HH_77 SATURDAY22
II_88 SUNDAY22

newfile2

AA_11 MONDAY
BB_22 WEDNESDAY
CC_33 THURSDAY
DD_44 FRIDAY
EE_55 TUESDAY
FF_66 FRIDAY
GG_51 WHAT
HH_77 SATURDAY
II_88 SUNDAY
JJ_101 MONTAG
KK_109 DIENSTAG
LL115 MITTWOCH
MM_55 DONNERSTAG

This way, output is:

AA_11  MONDAY24 MONDAY
II_99
KK_101
GG_77
HH_88
FF_66  FRIDAY99 FRIDAY
DD_44  FRIDAY62 FRIDAY
EE_55  TUESDAY72 TUESDAY
BB_22  WEDNESDAY42 WEDNESDAY
JJ_100
CC_33  THURSDAY52 THURSDAY
LL_102

I was expecting:

AA_11  MONDAY24 MONDAY
BB_22  WEDNESDAY42 WEDNESDAY
CC_33  THURSDAY52 THURSDAY
DD_44  FRIDAY62 FRIDAY
EE_55  TUESDAY72 TUESDAY
FF_66  FRIDAY99 FRIDAY
GG_77
HH_88
II_99
JJ_100
KK_101
LL_102

Thank you
Boris

--- Post updated at 08:04 AM ---

Hello,
Have just sorted out.

awk 'FNR == NR { lineno[$1] = NR; next} {print lineno[$1], $0;}' mainfile outputfile | sort -k 1,1n | cut -d' ' -f2-

Many thanks
Boris

The order of iteration over the array indicies is not defined/guaranteed.
From gawk manual:

Sorting the output, assumes that the entries in the mainfile is sorted as well, which might always be true.
A small change to the script guarantees the output in the same order as in he mainfile .

FNR==NR { a[$1];ord[FNR]=$1;next}
$1 in a {a[$1]=a[$1] OFS $2}
END {
  for (i=1;i in ord;i++)
    print ord,a[ord]
}
1 Like