Need help in comparing two files in UNIX with a mismatch

karthik_adiga · January 27, 2015, 6:04am

Hi Everyone,
I am comparing results of two environments using unix files.

I am writing two different csv file using spool and merging both the files using paste command

paste -d, file1.csv file2.csv > prod_uat_result.csv

and then finding the difference and attaching the same in a mail

cat prod_uat_result.csv |awk -F"," '{print $1,",",$2,",",$3,",",$4,",",$5,",",$6,",",$7,",",$8,",",$9,",",$10,",",$11,",",$12
,",",$13,",",$14,",",$15,",",$16,",",$17,",",$6-$14}' > prod_uat_result1.csv

uuencode prod_uat_result1.csv prod_uat_result1.csv > prod_uat_result1.txt

Under normal circumstances it is working fine.But if there is some record missing in file2.csv then it should skip the comparision.

ex
file1.csv

1,10
1,20
2,10
2,20
2,30
3,40
4,50

file2.csv

1,10
1,20
2,10
2,20
2,30
4,50

So,ideally (3,40) in file1.csv should not be compared with 4,50 of file2.csv to find the difference.

I have tried left outer join but it will not work out since there is no primary key,have also tried giving row number while writing to a file but that will also not compare the right results.

My requirement is, if row is not present in the second file (file2.csv) then the comparison should be skipped and compare next row.

Could you please help me with this

RudiC · January 27, 2015, 6:13am

Please use code tags as required by forum rules!

Is there any common field in both files by which the comparison could be done, or the missing row be identified, respectively?

RavinderSingh13 · January 27, 2015, 6:16am

Hello karthik adiga,

Kindly use code tags as per forum rules, assuming you need differances only with respect to the lines which are which are present in file1 and NOT present in file1 then following may help you. But it wouldn't tell differance lines which are present in file2 and NOT in file1.

awk -F, 'FNR==NR{X[$1 FS $2]=$1 FS $2;next} ($0 in X){print X[$0] " entry from file1 is present in file2.";delete X[$0]} END{for(i in X){print X " entry NOT present in file2."}}' file1.csv file2.csv

Output will be as follows.

1,10 entry from file1 is present in file2.
1,20 entry from file1 is present in file2.
2,10 entry from file1 is present in file2.
2,20 entry from file1 is present in file2.
2,30 entry from file1 is present in file2.
4,50 entry from file1 is present in file2.
3,40 entry NOT present in file2.

Thanks,
R. Singh

karthik_adiga · January 27, 2015, 7:09am

Thanks Ravindra for the solution.

I am getting syntax error when trying to execute the same.

cib-sokay2{u384283}167:awk -F, 'FNR==NR{X[$1 FS $2]=$1 FS $2;next} ($0 in X){print X[$0] " entry from file1 in file2.";
> delete  X[$0]} END{for(i in X){print X " entry NOT present in file2."}}' file1.csv file2.csv
awk: syntax error near line 1
awk: bailing out near line 1

I am new to unix.Could you please help me in solving this

Akshay_Hegde · January 27, 2015, 7:16am

If you are on Solaris/SunOS system, change awk at the start of the script to /usr/xpg4/bin/awk or /usr/xpg6/bin/awk or nawk .

RavinderSingh13 · January 27, 2015, 7:17am

Hello karthik,

Seems you are using a Solaris/SunOS system, on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk .

Thanks,
R. Singh

karthik_adiga · January 27, 2015, 7:46am

Thanks a lot Ravinder and Akshay..its working

Akshay_Hegde · January 27, 2015, 7:56am

Since you are comparing line, you can also try like this

[akshay@localhost tmp]$ cat f1
1,10
1,20
2,10
2,20
2,30
3,40
4,50

[akshay@localhost tmp]$ cat f2
1,10
1,20
2,10
2,20
2,30
4,50

[akshay@localhost tmp]$ nawk  'FNR==NR{X[$0];next}$0 in X{print $0,"- In both";delete X[$0]}END{for(i in X)print i ,"- Not In",FILENAME}' f1 f2
1,10 - In both
1,20 - In both
2,10 - In both
2,20 - In both
2,30 - In both
4,50 - In both
3,40 - Not In f2

karthik_adiga · January 30, 2015, 4:43am

Hi,

Could you please help me out with this requirement

awk -F, 'FNR==NR{X[$1 FS $2]=$1 FS $2;next} ($0 in X){print X[$0] " entry from file1 is present in file2.";delete X[$0]} END{for(i in X){print X " entry NOT present in file2."}}' file1.csv file2.csv

i need output like

file1.csv
1,10
1,20
2,10
2,20
2,30
3,40
4,50
 
file2.csv
1,10
1,20
2,10
2,20
2,30
4,50
 
o/p 
1,10,1,10 
1,20,1,20
2,10,2,10
2,20,2,20
2,30,2,30
4,50,4,50
3,40,

So basically , it should print both file1 and file2 values and
null when not matching.

The above mentioned command prints values of file1 only.

what changes should be done in the command....can i use an awk inside nawk?

---------- Post updated at 04:43 AM ---------- Previous update was at 04:12 AM ----------

Sorry guys,

that was a silly question

got it ..

nawk  'FNR==NR{X[$0];next}$0 in X{print $0,",",$1;delete X[$0]}END{for(i in X)print i}' file1.csv file2.csv