awk to replace values in one file using a second reference file

aberg · April 21, 2016, 10:59am

Hi,

I'd be grateful for your help with the following:

I have a file with a single column (file1). Let's say the values are:

a
b
c
5
d

I have a second, reference file (ref_file), which is colon-delimited, and is effectively a key. Let's say the values in it are:

a:1
b:2
c:3
d:4
etc.

I want to use an awk command to scan through file1 and replace each value with the matching element from the reference file, so that the output will be:

1
2
3
5
4

(I want values that are already in the correct format in file1 to be left alone - the '5' in this case).

The script I have tried is:

awk -F: 'FNR==NR{a[$1]=$2;next} {for (i in a)sub(i, a[i]);print}' ref_file file1

This doesn't work for some reason, and I don't know why. When I tried the script on a much shortened version of file1, the error message that I get is: awk: can't open file file1.

Any help/suggestions would be much appreciated.
Many thanks.

Yoda · April 21, 2016, 11:01am

awk -F: 'NR==FNR{a[$1]=$2;next}$1 in a{$1=a[$1]}1' ref_file file1
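
For anyone reading along, here is a commented sketch of how the one-liner above processes the two files. The sample data is recreated from the first post, and the temporary directory and the variable names (demo_dir, result) are only there for illustration.

```shell
# Recreate the sample files from the first post in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' a b c 5 d > "$demo_dir/file1"
printf '%s\n' a:1 b:2 c:3 d:4 > "$demo_dir/ref_file"

result=$(awk -F: '
    # NR == FNR is only true while the first file (ref_file) is being read:
    # store key -> value and skip to the next line.
    NR == FNR { a[$1] = $2; next }
    # From here on we are in file1: replace the field if it is a known key.
    $1 in a   { $1 = a[$1] }
    # A bare 1 is an always-true pattern, so every line is printed.
    1
' "$demo_dir/ref_file" "$demo_dir/file1")

printf '%s\n' "$result"
rm -rf "$demo_dir"
```

With the sample data this should print 1, 2, 3, 5 and 4 on separate lines; the 5 passes through untouched because it is not a key in ref_file.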

Scrutinizer · April 21, 2016, 11:08am

@OP, your script works fine here. What is your OS and version? Perhaps you made a typo somewhere, maybe in the quotes?

---
Slightly different alternative approach, which makes the script a bit more robust by eliminating spaces if present in file1:

awk 'FNR==NR{A[$1]=$2; next} $1 in A{$1=A[$1]}1' FS=: ref_file FS=" " file1
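
A small sketch of why the per-file FS assignments can matter, assuming file1 contains trailing spaces (that is a guess about the real data, not something confirmed in this thread). The scratch directory and the variable names are hypothetical.

```shell
sp_dir=$(mktemp -d)
# file1 with trailing spaces on some lines (assumed for illustration).
printf '%s\n' 'a ' 'b' '5 ' > "$sp_dir/file1"
printf '%s\n' a:1 b:2 > "$sp_dir/ref_file"

# FS=":" for both files: "a " (with the space) is not a key, so it is not replaced.
with_colon=$(awk -F: 'NR==FNR{A[$1]=$2; next} $1 in A{$1=A[$1]}1' \
    "$sp_dir/ref_file" "$sp_dir/file1")

# FS=":" for ref_file, then FS=" " for file1: $1 is "a" without the space, so it matches.
per_file=$(awk 'FNR==NR{A[$1]=$2; next} $1 in A{$1=A[$1]}1' \
    FS=: "$sp_dir/ref_file" FS=" " "$sp_dir/file1")

printf 'colon only:\n%s\nper-file FS:\n%s\n' "$with_colon" "$per_file"
rm -rf "$sp_dir"
```

Assignments such as FS=: that appear between the file names are applied just before the following file is read, which is what lets each file have its own separator. Note that lines which are not replaced (the "5 " here) are printed as they were, trailing space included.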

aberg · April 21, 2016, 11:12am

Thanks for the quick reply. I've tried your version of the script. While it's now reading the files and actually executing the command, the output is completely unchanged from the original file 1...

---------- Post updated at 10:12 AM ---------- Previous update was at 10:09 AM ----------

Scrutinizer, your version of the code appears to have worked.

Thank you both very much.

Scrutinizer · April 21, 2016, 11:15am

Can it be that your input files are in DOS format (contain carriage returns)?

Try:

tr -d '\r' < file > file.new

first, on both files, and then try again...

---
Otherwise there may be excess spaces in file1. My script should take care of this...
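
To illustrate the carriage-return case, here is a rough sketch that builds a DOS-format file1 and shows the lookup failing, then succeeding once the carriage returns are stripped inside awk. Whether the original files actually contained carriage returns is not known; the file contents and variable names below are made up for the demonstration.

```shell
cr_dir=$(mktemp -d)
# file1 in DOS format: every line ends in \r\n (assumed for illustration).
printf 'a\r\nb\r\n5\r\n' > "$cr_dir/file1"
printf 'a:1\nb:2\n' > "$cr_dir/ref_file"

# Without any cleanup the key read from file1 is "a\r", which is not in the array.
dos_out=$(awk -F: 'NR==FNR{a[$1]=$2; next} $1 in a{$1=a[$1]}1' \
    "$cr_dir/ref_file" "$cr_dir/file1")

# Stripping a trailing \r from every line of both files lets the keys match.
fixed_out=$(awk -F: '{ sub(/\r$/, "") } NR==FNR{a[$1]=$2; next} $1 in a{$1=a[$1]}1' \
    "$cr_dir/ref_file" "$cr_dir/file1")

printf '%s\n' "$fixed_out"
rm -rf "$cr_dir"
```

The sub(/\r$/, "") approach avoids creating the .new files, at the cost of a slightly longer script; converting the files once with tr is the simpler option if they are going to be reused.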