I need to compare two files that have the following structure:
File1:
No : 1
Name : George/Brown
Value2 : type2
Value3 : type3
Date : Wed Oct 20 11:12:58 2010
Value : yes
No : 2
Name : John/Cash
Value2 : type2
Value3 : type3
Date : Wed Oct 20 11:12:58 2010
Value : 17
No : 3
Name : Maria/Blond
Value2 : type2
Value3 : type3
Date : Wed Oct 20 11:12:58 2010
Value : yes
File2:
No : 1
Name : George/Brown
Value2 : type2
Value3 : type3
Date : Wed Jan 20 12:12:34 2010
Value : yes
No : 2
Name : John/Cash
Value2 : type2
Value3 : type3
Date : Wed Oct 20 13:15:45 2010
Value : 14
No : 3
Name : Maria/Blond
Value2 : type2
Value3 : type3
Date : Wed Oct 20 12:12:54 2010
Value : no
I need the output to look like this, listing only the records whose Value differs:
Name : John/Cash
Value(file1) : 17
Value(file2) : 14
Name : Maria/Blond
Value(file1) : yes
Value(file2) : no
If you first flatten each record into one line per field, such as No|Name|Field|Value, you can then sort them and run the two files through comm -3 to get lines like this (\t is tab):
\t2|John/Cash|Value|14
2|John/Cash|Value|17
Then a sed script can marry the lines back together and produce your format (if you are fussy). One challenge is that the file 2 line may sort low and come out first, as in my example. Also, the output will not be in the original order but in ascending binary sort order.
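A rough sketch of that pipeline, under some assumptions: the field separator is exactly " : ", only the Value field matters, and no value contains a | character. The function names flatten, compare_flat and marry are made up for illustration, and the marrying step uses awk rather than sed because it has to remember which side each line came from.

```shell
# Hypothetical helper: emit one sorted "No|Name|Value|<value>" line per record.
flatten() {
  awk -F ' : ' '
    $1 == "No"    { no = $2 }
    $1 == "Name"  { name = $2 }
    $1 == "Value" { print no "|" name "|Value|" $2 }
  ' "$1" | LC_ALL=C sort
}

# comm -3 drops the common lines; lines only in file 2 get a leading tab.
compare_flat() {
  tmp=$(mktemp -d) || return 1
  flatten "$1" > "$tmp/a"
  flatten "$2" > "$tmp/b"
  LC_ALL=C comm -3 "$tmp/a" "$tmp/b"
  rm -rf "$tmp"
}

# Marry the file 1 and file 2 lines back together, whichever came out first.
marry() {
  awk -F '|' '
    {
      side = ($0 ~ /^\t/) ? 2 : 1          # leading tab means "file 2 only"
      sub(/^\t/, "", $1)
      key = $1 "|" $2
      if (!(key in seen)) { seen[key] = 1; order[++n] = key; name[key] = $2 }
      val[key, side] = $4
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        print "Name : " name[k]
        print "Value(file1) : " val[k, 1]
        print "Value(file2) : " val[k, 2]
      }
    }
  '
}
```

Usage would be something like: compare_flat File1 File2 | marry. Records come out in sort order of the No field, not in the original file order.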
Writing a merge routine is a bit much for scripts, but it can be done. Many programmers muff the logic even in more powerful languages, so using off the shelf tools is a big win.
Maybe the awk guys have a way to deal with it. How invariant is the format? Can lines come and go or move around? Your example has 2 different headers on rec no.
I suppose you could parse one file and for data lines, get the same line # from the other file to compare. You could mark the name lines so they never compare, and then use diff in one of its modes to show you where the changes are.
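The same-line-number idea could look roughly like this. It is only a sketch and assumes both files hold the same records in the same order with the same number of lines, which is exactly the invariance question raised above. The function name compare_by_line is invented for the example.

```shell
# Hypothetical helper: walk file 1 and pull the same-numbered line from file 2.
compare_by_line() {
  awk -F ' : ' -v other="$2" '
    { if ((getline peer < other) <= 0) exit }   # peer = same line # in file 2
    $1 == "Name"  { name = $2 }                 # Name lines are never compared
    $1 == "Value" {
      split(peer, p, " : ")
      if (p[2] != $2) {
        print "Name : " name
        print "Value(file1) : " $2
        print "Value(file2) : " p[2]
      }
    }
  ' "$1"
}
```

Call it as compare_by_line File1 File2. If a line is inserted or removed in one file, everything after it is compared against the wrong partner, so this only suits rigidly formatted input.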
The diff has a mode I like a lot, '-C 999999', where all lines are present, each with a marker in front (blank for unchanged, ! for changed, + for added, - for removed), so you could parse the diff output, capturing the unchanged Name lines and reporting the changed lines in one stream in a 'while read l; do ...; done' loop. Try that. Many ways to skin a cat in UNIX!
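Here is one way the read loop might go. Note the swap: this sketch uses the unified format, -U 999999, instead of -C, because unified output marks old lines with - and new lines with + in a single stream, which is easier to parse. It assumes Name lines are identical in both files and that only Value changes are wanted. The function name report_diff is hypothetical.

```shell
# Hypothetical helper: full-context unified diff, filtered in a read loop.
report_diff() {
  diff -U 999999 "$1" "$2" | while IFS= read -r l; do
    case $l in
      ' Name : '*)                 # unchanged Name line: remember it
        name="${l# }" ;;
      '-Value : '*)                # old Value, from file 1
        [ -n "$name" ] && { echo "$name"; name=''; }
        echo "Value(file1) : ${l#?Value : }" ;;
      '+Value : '*)                # new Value, from file 2
        [ -n "$name" ] && { echo "$name"; name=''; }
        echo "Value(file2) : ${l#?Value : }" ;;
    esac
  done
}
```

Run it as report_diff File1 File2. The Name is printed only once per record, the first time a changed Value line turns up after it, so records with only a Date change stay silent.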
OK, it does not exactly match the wanted output, but the information is there (it seems you are not interested in differences in the Date field, so I just skipped it).
in1 is File1
in2 is File2
This way the files can simply be grepped on the name, and/or run through comm, and/or piped through ... | sort | uniq, or you can just choose a <key>,
where <key> could be the value of No instead of the value of Name.
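For the key-based route, a minimal awk sketch might look like the following. It is an illustration only, not necessarily the command used above: it assumes the separator is exactly " : ", that the key field appears before Value in each record, and that keys are unique within a file. The function name compare_by_key and its optional third argument are invented here.

```shell
# Hypothetical helper: compare Value per record, matched on a key field.
# $1 = first file (in1), $2 = second file (in2), $3 = key field (default Name).
compare_by_key() {
  awk -F ' : ' -v key="${3:-Name}" '
    $1 == key     { k = $2 }               # current record key
    $1 == "Name"  { name = $2 }            # kept for the report
    $1 != "Value" { next }                 # Date and the rest are skipped
    NR == FNR     { v1[k] = $2; next }     # first file: remember Value by key
    (k in v1) && (v1[k] "") != ($2 "") {   # second file: string comparison
      print "Name : " name
      print "Value(file1) : " v1[k]
      print "Value(file2) : " $2
    }
  ' "$1" "$2"
}
```

compare_by_key in1 in2 matches records on Name, while compare_by_key in1 in2 No matches them on the record number. Output follows the order of the second file, and records missing from the first file are skipped silently.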