Compare two files based on values of fields.

Hi All,
I have two files and data looks like this:

File1 Contents

#Field1,Field2
Dist_Center_file1.txt;21
Dist_Center_file3.txt;20
Dist_Center_file2.txt;20

File2 Contents (*** No Header ***)

Dist_Center_file1.txt;23
Dist_Center_file2.txt;20
Dist_Center_file3.txt;20

I have to look up the 1st field value from the 1st file (e.g. Dist_Center_file1.txt) in the 2nd file and then compare the 2nd field values (fields are separated by a semicolon).

If the 2nd values differ, write a difference record to an output file. For example, a record in this output file will be:

** Counts are different for Dist_Center_file1.txt, File1 Cnt:21 File2 Cnt:23 *

Here the names File1 and File2 are constants, Dist_Center_file1.txt is the value from the 1st file, 21 is the count from the 1st file, and 23 is the count from the 2nd file.

When the 1st file is done, start with the 2nd file and make sure that all records of the 2nd file are present in the 1st file.

How can I do this?


Thanks in advance guys!!

# cat File1
Dist_Center_file1.txt;21
Dist_Center_file3.txt;20
Dist_Center_file2.txt;20
# cat File2
Dist_Center_file1.txt;23
Dist_Center_file2.txt;20
Dist_Center_file3.txt;20
# awk -F\; 'NR==FNR{a[$1]=$2;next}a[$1]!=$NF{printf "** Counts are different for %s, %s Cnt:%d %s Cnt:%d *\n",$1,ARGV[1],a[$1],ARGV[2],$NF}' File1 File2
** Counts are different for Dist_Center_file1.txt, File1 Cnt:21 File2 Cnt:23 *
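
Note that the original question also has a header line in File1 and asks for a check in both directions. A minimal sketch of how the one-liner above might be extended, assuming header lines always start with `#`; the extra record Dist_Center_file4.txt and the "is in ... but not in ..." message wording are made up for illustration and are not from the question:

```shell
#!/bin/sh
# Sketch: compare counts in both directions, skipping '#' header lines.
tmp=$(mktemp -d)
# Sample data from the question; File1 keeps its header line.
printf '%s\n' '#Field1,Field2' 'Dist_Center_file1.txt;21' \
    'Dist_Center_file3.txt;20' 'Dist_Center_file2.txt;20' > "$tmp/File1"
# File2 has no header; Dist_Center_file4.txt is a made-up extra record.
printf '%s\n' 'Dist_Center_file1.txt;23' 'Dist_Center_file2.txt;20' \
    'Dist_Center_file4.txt;7' > "$tmp/File2"

report=$(cd "$tmp" && awk -F';' '
    /^#/ { next }                      # ignore header lines in either file
    NR == FNR { cnt[$1] = $2; next }   # first file: remember the count per name
    { seen[$1] = 1 }                   # second file: mark the name as present
    !($1 in cnt) {                     # name exists only in the second file
        printf "** %s is in %s but not in %s *\n", $1, ARGV[2], ARGV[1]
        next
    }
    cnt[$1] != $2 {                    # same name, different count
        printf "** Counts are different for %s, %s Cnt:%d %s Cnt:%d *\n",
               $1, ARGV[1], cnt[$1], ARGV[2], $2
    }
    END {                              # names that exist only in the first file
        for (name in cnt)
            if (!(name in seen))
                printf "** %s is in %s but not in %s *\n", name, ARGV[1], ARGV[2]
    }' File1 File2)
echo "$report"
rm -rf "$tmp"
```

With this sample data it reports the count mismatch for Dist_Center_file1.txt, then Dist_Center_file4.txt as missing from File1, then Dist_Center_file3.txt as missing from File2. One caveat: the `NR == FNR` test misfires if File1 is empty, because File2 is then treated as the first file.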

Thanks for your quick response.

Let me try it and I will update you.


---------- Post updated at 06:49 PM ---------- Previous update was at 06:35 PM ----------

It is comparing fine.

If you don't mind, could you please explain the answer in detail?

Kind of like, explain for dummies!!
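
For anyone reading along, here is one way to unpack the awk one-liner from the answer. It is the same program spread over several lines with comments, run against the sample data in a temporary directory (the `tmp` and `out` variable names are just for this sketch):

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Sample data exactly as shown in the answer (no header line).
printf '%s\n' 'Dist_Center_file1.txt;21' 'Dist_Center_file3.txt;20' \
    'Dist_Center_file2.txt;20' > "$tmp/File1"
printf '%s\n' 'Dist_Center_file1.txt;23' 'Dist_Center_file2.txt;20' \
    'Dist_Center_file3.txt;20' > "$tmp/File2"

out=$(cd "$tmp" && awk -F';' '
    # -F";" splits every line on the semicolon:
    # $1 is the file name, $2 is the count ($NF means "last field", here also $2).
    #
    # NR counts lines across all input files, FNR restarts at 1 for each file,
    # so NR == FNR is only true while the first file (File1) is being read.
    NR == FNR {
        a[$1] = $2     # store the File1 count in array a, keyed by the name
        next           # go to the next line; skip the rule below for File1
    }
    # Only File2 lines reach this rule. a[$1] is the count File1 had for this name.
    a[$1] != $NF {
        # ARGV[1] and ARGV[2] are the file names given on the command line.
        printf "** Counts are different for %s, %s Cnt:%d %s Cnt:%d *\n",
               $1, ARGV[1], a[$1], ARGV[2], $NF
    }' File1 File2)
echo "$out"
rm -rf "$tmp"
```

This prints the single difference line for Dist_Center_file1.txt. A name that appears only in File2 has no entry in `a`, so it would be reported with `File1 Cnt:0`; a name that appears only in File1 is not reported at all.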


What's the point of the special output format?

If you don't need it, use the diff command below directly.

diff <(sort File1) <(sort File2)

1c1
< Dist_Center_file1.txt;21
---
> Dist_Center_file1.txt;23
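
Two things to watch with the diff approach: `<( ... )` is process substitution, which works in bash, ksh and zsh but not in plain sh, and the header line in File1 from the original question would show up as an extra difference. A portable sketch that strips lines starting with `#` first (assuming that is how headers are marked) and uses intermediate files; the `.sorted` file names are arbitrary:

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Sample data from the question; File1 has a header, File2 does not.
printf '%s\n' '#Field1,Field2' 'Dist_Center_file1.txt;21' \
    'Dist_Center_file3.txt;20' 'Dist_Center_file2.txt;20' > "$tmp/File1"
printf '%s\n' 'Dist_Center_file1.txt;23' 'Dist_Center_file2.txt;20' \
    'Dist_Center_file3.txt;20' > "$tmp/File2"

# Drop the header from File1, then sort both files so the record order does not matter.
grep -v '^#' "$tmp/File1" | sort > "$tmp/File1.sorted"
sort "$tmp/File2" > "$tmp/File2.sorted"

# diff exits with status 1 when the files differ, so "|| true" keeps
# a script running under "set -e" from stopping here.
result=$(diff "$tmp/File1.sorted" "$tmp/File2.sorted" || true)
echo "$result"
rm -rf "$tmp"
```

The output for the sample data is the same `1c1` block as above, without a spurious line for the header.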

Hi, I have 2 flat files and want to get the difference in record counts without counting the header, meaning one file has a header and the other doesn't have one. So if the first file has a count of 10 and the second has 11 (with header), how can I achieve this? I simply need the count after the header (#Date,Quantity'10/26/2010';123).
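
One possible approach, assuming the header is the only line that starts with `#`: let `grep -vc '^#'` count the lines that are not headers in each file and subtract the results. The file names and all data rows after the first are made up for this sketch:

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical sample files modelled on the post: one with a header, one without.
printf '%s\n' '#Date,Quantity' "'10/26/2010';123" "'10/27/2010';456" > "$tmp/with_header"
printf '%s\n' "'10/26/2010';123" "'10/27/2010';456" "'10/28/2010';789" > "$tmp/no_header"

# -v selects lines NOT matching '^#', -c prints how many lines were selected.
count1=$(grep -vc '^#' "$tmp/with_header")
count2=$(grep -vc '^#' "$tmp/no_header")

echo "with_header: $count1 records, no_header: $count2 records, difference: $((count2 - count1))"
rm -rf "$tmp"
```

Running the same grep on the file without a header is harmless, so there is no need to know in advance which file carries the header.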