Lookup file

Hi,

I need your help doing a lookup between these 2 files.

main.txt

RNPMS01,PMS717W_Marasi,CXP9016141/1_R7G04,EXECUTING
RNPMS01,RAP765W_BakaranBatu,CXP9014346/1_R6AG03,EXECUTING
RNPMS01,RNPMS01,CXP9014711/2_R5Z,EXECUTING
RNPMS01,TBT510W_Bandar_Utama,CXP9014346/1_R6AG03,EXECUTING
RNPMS01,PMS209W_Simpang_Dua,CXP9016868/1_R4G02,EXECUTING
RNPMS01,PMS207W_Ade_Irma,CXP9014346/1_R6AG03,EXECUTING
RNPMS01,PMS224W_Suzuya_Siantar,CXP9014346/1_R6AG03,EXECUTING
RNPBO01,PBO043W_KTI_Probolingo,CXP9014346/1_R6AG03,EXECUTING

code.txt

CXP9016141/1_R7G04,P5.1.2
CXP9014346/1_R6AG03,W10.1.3.7
CXP9014711/2_R5Z,P7.1.4
CXP9016868/1_R4G02,P6.1.2.3

Expected output:

RNPMS01,TBT510W_Bandar_Utama,W10.1.3.7,EXECUTING

Thanks

BR
///Singgih

Can you describe how you arrive at this output? I imagine you want to replace the 3rd field of main.txt with the 2nd field of code.txt, but your output shows only 1 line, and there are many matches for "CXP9014346/1_R6AG03".

[mute@geek ~/temp/singgih]$ ./script code.txt main.txt
RNPMS01,PMS717W_Marasi,P5.1.2,EXECUTING
RNPMS01,RAP765W_BakaranBatu,W10.1.3.7,EXECUTING
RNPMS01,RNPMS01,P7.1.4,EXECUTING
RNPMS01,TBT510W_Bandar_Utama,W10.1.3.7,EXECUTING
RNPMS01,PMS209W_Simpang_Dua,P6.1.2.3,EXECUTING
RNPMS01,PMS207W_Ade_Irma,W10.1.3.7,EXECUTING
RNPMS01,PMS224W_Suzuya_Siantar,W10.1.3.7,EXECUTING
RNPBO01,PBO043W_KTI_Probolingo,W10.1.3.7,EXECUTING

Is that correct?

Yes, that's the expected output.
Can you show me the script?

Thanks

BR
///Singgih

Oh yes, sorry. I suppose I could have posted it while awaiting confirmation.

#!/usr/bin/awk -f
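BEGIN { FS=OFS="," }                     # read and write comma-separated fields
FNR==NR { a[$1]=$2; next }               # first file: store field 2, keyed by field 1
($3 in a) { print $1, $2, a[$3], $4 }    # second file: print matches with field 3 replaced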

This is the standard awk lookup idiom. The FNR==NR rule reads in the first file (code.txt), storing column 2 in an array keyed by column 1.

Also as 1-liner: awk -F, 'FNR==NR{a[$1]=$2;next}($3 in a){print $1,$2,a[$3],$4}' OFS=, code.txt main.txt

Hi neutronscott,

A million thanks.
Can you explain the meaning of this:

FNR==NR { a[$1]=$2; next }

BR
///Singgih

I thought I did? :) NR is a built-in variable that holds the number of records read so far, which here is effectively the running line number across all input files. FNR is the same count but resets for each file, so FNR==NR is only true while awk is processing the first file. The action { a[$1]=$2; next } stores the second field in the array a, using the first field as the key. The next statement then moves on to the following line without further processing (it skips the remaining print rule).

Once we reach the 2nd file, FNR==NR is no longer true. (Imagine the first file is just 4 lines, as in your example: on the first line of the second file FNR would be 1 while NR would continue to 5.) In that case we check whether the 3rd field matches any of the stored keys ($3 in a) and, if so, print the line with the 3rd field replaced by the stored value.
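To make the NR/FNR behaviour concrete, here is a small sketch that prints both counters for every line of two throwaway files. The temporary directory and the names first.txt and second.txt are made up for this illustration and are not part of the original question.

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a,1\nb,2\n' > "$tmp/first.txt"
printf 'x,a\ny,b\nz,c\n' > "$tmp/second.txt"

# For each input line, print NR, FNR, and which side of the FNR==NR test it falls on.
nr_demo=$(awk '{ print NR, FNR, (FNR == NR ? "first file" : "later file") }' \
    "$tmp/first.txt" "$tmp/second.txt")
printf '%s\n' "$nr_demo"

rm -rf "$tmp"
```

With these two files, the first two lines report equal counters (1 1 and 2 2), and from the third line on FNR restarts at 1 while NR keeps counting. One caveat with this idiom: if the first file is empty, FNR==NR is also true for every line of the second file, so the lookup silently misbehaves.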

Thank you for explanation of the script.

The above script prints all lines in file2 that match a record in file1. Is it possible to print the unmatched lines as well? The expected output for an unmatched record is shown in the example below, assuming CXP9016868/1_R5G08 is not in the lookup file:

RNPMS01,PMS209W_Simpang_Dua,CXP9016868/1_R5G08,UNKNOWN

Looking forward to your reply.

Print it unchanged from the input?

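BEGIN { FS=OFS="," }            # read and write comma-separated fields
FNR==NR { a[$1]=$2; next }      # first file: store field 2, keyed by field 1
($3 in a) { $3=a[$3] }          # second file: replace field 3 when it has a match
1 { print }                     # always-true pattern: print every record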

This way all records are printed, and ones that match will have field3 replaced.
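As a quick check, the script can be run against a cut-down version of the sample data. This is only a sketch: the temporary directory is made up for the illustration, and the second main.txt line (with CXP9016868/1_R5G08 and UNKNOWN already in the input) is an assumption based on the example given in the question.

```shell
# Hypothetical scratch copies of the two files, trimmed to a few lines.
tmp=$(mktemp -d)
cat > "$tmp/code.txt" <<'EOF'
CXP9016141/1_R7G04,P5.1.2
CXP9016868/1_R4G02,P6.1.2.3
EOF
cat > "$tmp/main.txt" <<'EOF'
RNPMS01,PMS717W_Marasi,CXP9016141/1_R7G04,EXECUTING
RNPMS01,PMS209W_Simpang_Dua,CXP9016868/1_R5G08,UNKNOWN
EOF

# Same rules as above: replace field 3 on a match, then print every record.
result=$(awk 'BEGIN { FS = OFS = "," }
FNR == NR { a[$1] = $2; next }
($3 in a) { $3 = a[$3] }
1' "$tmp/code.txt" "$tmp/main.txt")
printf '%s\n' "$result"

rm -rf "$tmp"
```

The first line comes out with P5.1.2 in field 3, while the second line, whose code has no entry in code.txt, is printed exactly as it appeared in the input.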

Thank you .....