awk to update a field in one file based on a match in another

I am trying to use awk to match two tab-delimited files. When file1 $1 matches file2 $4, $4 in file2 is updated with the $2 value from file1. If no match is found, the next line is processed. Thank you :).

file1

uc001bwr.3    ADC
uc001bws.3    ADC
uc001bwt.1    ADC
uc001bwu.3    ADC
uc001bwv.3    ADC
uc001bwx.1    ADC
uc001bwy.1    ADC
uc001bwz.1    ADC
uc001chv.2    LEPRE1
uc001chw.2    LEPRE1
uc001chx.4    LEPRE1
uc001chy.4    LEPRE1

file2

chr1    33546703    33546905    uc001bwr.3
chr1    33546978    33547119    uc001bwr.3
chr1    33547191    33547423    uc001bwr.3
chr1    43211995    43212533    uc001chw.2
chr1    43212913    43213093    uc001chw.2

desired output (tab-delimited)

chr1    33546703    33546905    ADC
chr1    33546978    33547119    ADC
chr1    33547191    33547423    ADC
chr1    43211995    43212533    LEPRE1
chr1    43212913    43213093    LEPRE1

awk

awk -F'\t' -v OFS='\t'  'NR==FNR{a[$1]=4}NR!=FNR{if(a[$4])' file1 file2
awk: cmd. line:1: NR==FNR{a[$1]=4}NR!=FNR{if(a[$4])
awk: cmd. line:1:                                  ^ unexpected newline or end of string

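That command is incomplete: the if has no body and the closing braces and quote logic after it are missing, which is why awk complains. Even once those are added it will not do what you are asking, because a[$1]=4 stores the constant 4 instead of the $2 value from file1.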

awk 'FNR==NR { a[$1]=$2; next } { if(a[$4]){$4=a[$4] }; print }' OFS="\t" file1 file2
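For reference, a minimal sketch of how this two-file idiom behaves, using a cut-down copy of the sample data from the question. The temporary directory and the `result` variable are only there for illustration. It tests membership with `$4 in a` instead of `if(a[$4])`, which should give the same result on this data but does not create empty array entries for keys that have no match.

```shell
# Build small tab-delimited samples (a subset of the data from the question).
tmp=$(mktemp -d)
printf 'uc001bwr.3\tADC\nuc001chw.2\tLEPRE1\n' > "$tmp/file1"
printf 'chr1\t33546703\t33546905\tuc001bwr.3\nchr1\t43211995\t43212533\tuc001chw.2\n' > "$tmp/file2"

# While reading file1 (FNR==NR): remember id -> gene name, then skip to the next line.
# While reading file2: replace $4 if its id was seen, and print every line.
result=$(awk -F'\t' -v OFS='\t' '
    FNR == NR { a[$1] = $2; next }
    $4 in a   { $4 = a[$4] }
    { print }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Setting -F'\t' as well keeps fields that contain spaces intact; the one-liner above splits on any whitespace, which is fine for this data.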

Hello cmccabe,

Could you please try the following and let me know if it helps.

awk  -F"\t" 'FNR==NR{A[$1]=$2;next} ($NF in A){$NF=A[$NF];print}' OFS="\t"  Input_file1   Input_file2
 

Output will be as follows.

chr1    33546703        33546905        ADC
chr1    33546978        33547119        ADC
chr1    33547191        33547423        ADC
chr1    43211995        43212533        LEPRE1
chr1    43212913        43213093        LEPRE1
 

Thanks,
R. Singh


Be aware that this will drop any line of Input_file2 that does not have a match in Input_file1 from the output, because the print sits inside the action that only runs on a match.
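A quick way to see the difference, as a sketch: the id uc999xxx.9 and its coordinates below are made up so that one line of the second file has no match, and the variable names are only for illustration. The second command is one possible variant that moves the print out of the matching action (the trailing 1 is a rule that is always true, so every line is printed).

```shell
tmp=$(mktemp -d)
printf 'uc001bwr.3\tADC\n' > "$tmp/file1"
# The second id (uc999xxx.9) is invented so that it has no match in file1.
printf 'chr1\t33546703\t33546905\tuc001bwr.3\nchr1\t100\t200\tuc999xxx.9\n' > "$tmp/file2"

# print inside the matching action: the unmatched line is dropped.
dropped=$(awk -F'\t' -v OFS='\t' 'FNR==NR{A[$1]=$2;next} ($NF in A){$NF=A[$NF];print}' "$tmp/file1" "$tmp/file2")

# print as its own rule (the trailing 1): the unmatched line passes through unchanged.
kept=$(awk -F'\t' -v OFS='\t' 'FNR==NR{A[$1]=$2;next} ($NF in A){$NF=A[$NF]} 1' "$tmp/file1" "$tmp/file2")

printf 'dropped:\n%s\nkept:\n%s\n' "$dropped" "$kept"
rm -rf "$tmp"
```

Which behaviour is right depends on whether lines without a match should appear in the result at all.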


Thank you both :).