Replace Value of nth Column of Each Line Using Array

Hello All,

I am writing a shell script with the following requirement:

  1. I have one input file as below:
CHE01,A,MSC,INO
CHE02,B,NST,INC
CHE03,C,STM,INP
  2. In the shell script I have predefined arrays as below:

Array1={A, B, C}
Array2= {U09, C04, A054}

Now I wish to replace the second column of the input file with the corresponding value from Array2. If the second column of the input file matches the third element of Array1, it should be replaced by Array2(3) = A054, and so on. How do I achieve this using awk? I am avoiding a while loop because it would take a lot of time when run against a large data set.

Your help is honestly appreciated

Thanks
Angsuman

What operating system are you using?

What shell are you using?

What have you tried to solve this problem on your own?

Why use two arrays? It would be MUCH simpler to do this with a single array using the old value as the subscript and the new value as the value for that element.
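As a rough sketch of that single-array idea, assuming the two lists can be handed to awk as comma-separated strings (the names `keys`, `vals` and `replace_col2` are made up for illustration and are not from the original script), the lists can be folded into one lookup array in a BEGIN block:

```shell
# Hypothetical helper: replace field 2 using a single lookup array.
# "keys" and "vals" are illustrative names for the two parallel lists.
replace_col2() {
    awk -F, -v OFS=, -v keys="A,B,C" -v vals="U09,C04,A054" '
    BEGIN {
        # Build map[old value] = new value from the two parallel lists.
        n = split(keys, k, ",")
        split(vals, v, ",")
        for (i = 1; i <= n; i++)
            map[k[i]] = v[i]
    }
    $2 in map { $2 = map[$2] }   # replace only when a mapping exists
    { print }                    # print every line, changed or not
    ' "$@"
}

# Sample data from post #1
printf '%s\n' 'CHE01,A,MSC,INO' 'CHE02,B,NST,INC' 'CHE03,C,STM,INP' | replace_col2
```

With the sample data this should print the three lines with A, B and C in the second column replaced by U09, C04 and A054; lines whose second field has no mapping are passed through unchanged.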

How are those two arrays populated? From those "large data sets", perhaps? It would be much easier for awk to operate on file(s) should these be used to define the arrays.

awk  'NR==FNR { a[$1]=$4; next } {if ($2 ~ a["Array1"]) $2=a["Array2"]; print}' FS="= *{|, *|}" arrayfile FS=, inputfile

Hi abdulbadii,
Would you please explain how the above script is supposed to do something related to the topic of this thread?

If I slightly modify your script to add a debugging statement showing how the array a is set:

awk  '
NR==FNR {
	a[$1]=$4
	printf("a[%s] has been set to \"%s\"\n", $1, $4)
	next
}
{	if ($2 == a[array1])
		$2=a[array2]
}
' FS=,={} arrayfile FS=, inputfile

it produces the output:

a[Array1={A, B, C}] has been set to ""
a[Array2= {U09, C04, A054}] has been set to ""

when the arrayfile and inputfile files contain the data shown in post #1 in this thread.

Note that since neither of the awk variables array1 and array2 has been set, the if statement will never find a match. And, even if a match were found and the 2nd field was changed to an empty string by the assignment, what difference would it make? There is nothing in this awk script that produces any output.

Note also that using the fixed string =,={} as a field separator for the 1st input file seems strange since that string is never found in either of your input files. That means that each line will be field #1 for that line and field $4 in each line will be an empty string (which is exactly what we see from the added debugging statement).
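A quick way to see this behaviour is sketched below. It uses a plain `;` as a stand-in for a separator that never occurs in the data (not the exact separator from the script above), and the function name `show_fields` is only for illustration:

```shell
# Illustrative check: when the separator never occurs in a line,
# awk treats the whole line as field 1 and $4 is empty.
show_fields() {
    awk -v FS="$1" '{ printf("fields=%d first=[%s] fourth=[%s]\n", NF, $1, $4) }'
}

# Separator absent from the line: one field, empty $4
printf '%s\n' 'Array1={A, B, C}' | show_fields ';'
# Separator present: four fields
printf '%s\n' 'CHE01,A,MSC,INO' | show_fields ','
```

The first call should report a single field holding the entire line, the second should report four fields.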

Thank you all for your replies. As suggested by RudiC, I have changed my script to use another file instead of a predefined array. Here is the code used:


awk -F',' 'NR==FNR { a[$1]=$2; next } $2 in a {print $1 "," a[$1] "," a[$2] "," $3 "," $4 }' $CONFIGFILE $INPUTFILE

Now CONFIGFILE contains data as below:


A,U09
B,C04
C,A054

The above code works fine in this case, but it does not work when the config file contains data as below:


a,U09
b,C04
c,A054

How do I tell awk to ignore case?

Thanks
Angsuman

How about using awk's toupper() function?

BTW, your code snippet won't deliver what was requested in post #1, as a[$1] will always be undefined and thus produce an empty field (= ,,) in the output.
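Putting both remarks together, one possible sketch (not the original poster's final script; the function name `map_col2` and its two file arguments are assumptions for illustration) upper-cases the key on both sides of the lookup and replaces only the second field instead of printing an extra one:

```shell
# Hypothetical helper: map_col2 CONFIGFILE INPUTFILE
map_col2() {
    awk -F, -v OFS=, '
    NR == FNR { map[toupper($1)] = $2; next }   # first file: key -> replacement
    { key = toupper($2) }                       # normalise case of field 2
    key in map { $2 = map[key] }                # replace when a mapping exists
    { print }                                   # unmatched lines pass through
    ' "$1" "$2"
}
```

It would be called as map_col2 "$CONFIGFILE" "$INPUTFILE". Because both the config keys and the second input field go through toupper(), a config file with a,U09 and one with A,U09 should behave the same.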