Compare two files, if the columns are same then replace next column with some value

Hi Team,

As i'm new to the shell scripting, i stuck in a place to replace a column valuein a file when compared with other file. Could you please help me to work on this?
I have 2 files test1.csv and test2.csv. I need to replace empdep column in test1.csv if it has the value as "sign*". I have second file test2.csv which have the value need to replace that" sign*".

Note : I'm using ksh and test1.csv have around 2,048,576 rows and test2.csv have 10000 rows

test1.csv

empname,place,empdep
aaaa ,city1,001
bbbb,city2,sign-1
dddd,city1,005
ffff,city5,sign-2
hhhh,city7,sign-1

test2.csv

empname ,new
aaaa ,001 
bbbb,002
cccc,003
dddd ,005
eeee,006
ffff,007
gggg,008
hhhh,009

Expected Result:

empname,place,empdep
aaaa ,city1,001
bbbb,city2,002
dddd,city1,005
ffff,city5,007
hhhh,city7,009

Try

awk -F, 'NR==FNR {T[$1]=$2; next} $3 ~ /^sign/ {$3=T[$1]} 1' OFS=, file2 file1
empname,place,empdep
aaaa ,city1,001
bbbb,city2,002
dddd,city1,005
ffff,city5,007
hhhh,city7,009
2 Likes

Hi Rudic,
I just tried it in a sample file... Its working....

Thanks a lot :slight_smile:

Could you please explain the command, it will help me to understand?

When working on the first input (file2), NR == FNR , it collects field 2's values into an array indexed by field 1.
When working on the second, it checks if field 3 begins with sign , and replaces this with the respective array value (agin indexed by field 1).
The final 1 represents a "TRUE" result and starts the default action which is print which by default prints $0 .

2 Likes

I have tried for "I" separated file with below command got error as below... could u please help on this?

awk -F| 'NR==FNR {T[$1]=$2; next} $7 ~ /^Sync/ {$7=T[$1]} 1' OFS=| file2.csv file1.csv
awk: A flag requires a parameter: F
Usage: awk [-u] [-F Character][-v Variable=Value][-f File|Commands][Variable=Value|File ...]
ksh: NR==FNR {T[$1]=$2; next} $7 ~ /^Sync/ {$7=T[$1]} 1:  not found.

Try escaping / quoting the pipe character to use it for the field separator; else it will be interpreted by the shell for a command pipeline (and fail).

I tried the below command, but got empt space as a result

awk 'BEGIN{ FS=OFS="|" }; NR==FNR {T[$1]=$2; next} $7 ~ /^sign {$7=T[$1]} 1' OFS="|" file1.csv file2.csv> final.csv

wrong result please check below

 xfsgdh|gdjshd|bbbb|20191228|230||sign-1|14420|44766|||||||20191228 ---  original 

xfsgdh|gdjshd|bbbb|20191228|230|||14420|44766|||||||20191228 -  result after code

Looks like T[$1] is the empty string? What's your input data?