Compare two CSV files and write the differences to a third file with line number, field number and differing values

I have two CSV files. I need to compare them, and the output file should list the differences at the field level.

For example:

File 1:

A,B,C,D,E,F
1,2,3,4,5,6

File 2:

A,C,B,D,E,F
1,2,4,5,5,6

Output file:

Line 1;Field 2 diff Value B=C
Line 1;Field 3 diff Value C=B
Line 2;Field 3 diff Value 3=4
Line 2;Field 4 diff Value 4=5
etc.....

Can anyone please help?

I used diff but was only able to get line numbers. I also need field numbers and the differing values.

$ awk -F, 'NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next}
{for(i=1;i<=NF;i++){if(A[i,FNR]!=$i){print "Line",NR";Field",i,"diff value",A[i,FNR]"="$i}}}' file1 file2

Line 3;Field 2 diff value B=C
Line 3;Field 3 diff value C=B
Line 4;Field 3 diff value 3=4
Line 4;Field 4 diff value 4=5
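The one-liner above prints NR, which keeps counting across both files, so the differences show up as lines 3 and 4 instead of lines 1 and 2. Below is a minimal sketch that prints FNR (the line number within the current file) and redirects the result to a third file. The file names file1.csv, file2.csv and diff.txt and the temporary directory are placeholders for this example, not names from the original question.

```shell
#!/bin/sh
# Sketch: field-level comparison numbered per file, written to a third file.
# All file names below are placeholders made up for this example.
tmp=$(mktemp -d)
printf 'A,B,C,D,E,F\n1,2,3,4,5,6\n' > "$tmp/file1.csv"
printf 'A,C,B,D,E,F\n1,2,4,5,5,6\n' > "$tmp/file2.csv"

awk -F, '
NR == FNR {                       # first file: remember every field by (field, line)
    for (i = 1; i <= NF; i++) A[i, FNR] = $i
    next
}
{                                 # second file: compare field by field
    for (i = 1; i <= NF; i++)
        if (A[i, FNR] != $i)
            print "Line " FNR ";Field " i " diff Value " A[i, FNR] "=" $i
}' "$tmp/file1.csv" "$tmp/file2.csv" > "$tmp/diff.txt"

cat "$tmp/diff.txt"
```

Note that this only walks the fields and lines of the second file, so extra lines or trailing fields that exist only in the first file are not reported.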

When I run the above command, I get a "not found" error.

What is not found? The command?

Please post the error here.

rwxd256:dotisdm1:/home/corebank: $ cd /opt/DataMigration/data/NKARING
rwxd256:dotisdm1:/opt/DataMigration/data/NKARING:  $ awk -F, 'NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next} 
continue...> ";Field",i,"diff value",A[i,FNR]"="$i}}}' taxdetails.csv RC023_test.csv <
ksh: $: not found
rwxd256:dotisdm1:/opt/DataMigration/data/NKARING: $ 


I got this error when I ran the command on Unix. I also used it in a script.

It looks like you made some mistakes while copying the code.
Where is the opening bracket after next?
Where is the print? Where is the for loop?

Please copy the code correctly and then post the results.

#!/bin/bash
$awk -F, 'NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next}
{for(i=1;i<=NF;i++){if(A[i,FNR]!=$i){print "Line",NR,";Field",i,"diff value",A[i,FNR]"="$i}}}' taxdetails.csv RC023_test.csv

I am using this in a script named Diff.sh,

but when I run the script it gives this error:

rwxd256:dotisdm1:/opt/DataMigration/data/NKARING: $ Diff.sh
Diff.sh: line 2: -F,: command not found
rwxd256:dotisdm1:/opt/DataMigration/data/NKARING: $ 

I am new to Unix, please help.

Please remove the $ in front of awk in the script above.
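A likely explanation for the "-F,: command not found" message, assuming no variable named awk is set in the script: the shell treats $awk as a variable reference, it expands to nothing, and the next word, -F, ends up being run as the command. The sketch below reproduces that with a throwaway sample.csv (a made-up file for this example).

```shell
#!/bin/sh
# Sketch: why a "$" glued to the front of awk breaks the script.
unset awk                          # assumption: no shell variable called awk is set
tmp=$(mktemp -d)
printf 'A,B\n1,2\n' > "$tmp/sample.csv"

# "$awk" expands to nothing, so the shell tries to run "-F," as the command.
status=0
$awk -F, 'NR == 2 {print $2}' "$tmp/sample.csv" 2>/dev/null || status=$?
echo "with the \$ prefix: exit status $status"

# Without the "$", awk itself is the command and the program runs.
awk -F, 'NR == 2 {print $2}' "$tmp/sample.csv"
```

The "$ " shown at the start of command lines in posts is only the shell prompt and should not be typed or copied into a script.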

Have you tried an awk script file instead of the .sh?

Try:

$ cat awk.awk
NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next}
{for(i=1;i<=NF;i++){if(A[i,FNR]!=$i){print "Line",NR";Field",i,"diff value",A[i,FNR]"="$i}}}

$ awk -F, -f awk.awk file1 file2
Line 3;Field 2 diff value B=C
Line 3;Field 3 diff value C=B
Line 4;Field 3 diff value 3=4
Line 4;Field 4 diff value 4=5

This time I get a different error:

rwxd256:dotisdm1:/opt/DataMigration/data/NKARING: $ Diff.sh
ksh: cannot fork: too many processes

#!/bin/bash


awk -F ,'NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next}
{for(i=1;i<=NF;i++){if(A[i,FNR]!=$i){print "Line",NR,"Field",i,"diff value",A[i,FNR]"="$i}}}' taxdetails.csv RC023_test.csv

Have you tried my second suggestion with awk file..?

Yes, I got the same error:

rwxd256:dotisdm1:/opt/DataMigration/data/NKARING: $ awk.awk
ksh: cannot fork: too many processes

Could you please read my post again (post #8)?
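For anyone following along, the awk file from that post is meant to be handed to awk with -f, together with the two input files, rather than typed as a command on its own. A minimal sketch, assuming the same sample data as in the question; the temporary directory and the file names inside it are placeholders for this example.

```shell
#!/bin/sh
# Sketch: the .awk file is passed to awk with -f; it is not run as a command.
tmp=$(mktemp -d)
printf 'A,B,C,D,E,F\n1,2,3,4,5,6\n' > "$tmp/file1"
printf 'A,C,B,D,E,F\n1,2,4,5,5,6\n' > "$tmp/file2"

# The quoted 'EOF' keeps the shell from expanding $i inside the awk program.
cat > "$tmp/awk.awk" <<'EOF'
NR==FNR{for(i=1;i<=NF;i++){A[i,NR]=$i}next}
{for(i=1;i<=NF;i++){if(A[i,FNR]!=$i){print "Line",NR";Field",i,"diff value",A[i,FNR]"="$i}}}
EOF

# -F, sets the field separator, -f names the program file, then the two inputs.
out=$(awk -F, -f "$tmp/awk.awk" "$tmp/file1" "$tmp/file2")
printf '%s\n' "$out"
```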

Thank you, Pamu, it worked.