Edited: compare two files and print mismatch

Using a Unix shell script, how can I compare two files and print the lines that do not match? The requirements are:

  1. The two files do not have the same number of lines.
  2. The mismatch can be in the second or third column.
  3. The comparison is not between line 1 of file 1 and line 1 of file 2. Rather, each line of file 1 is compared with the line of file 2 that starts with the same first word.

To demonstrate:
FILE 1:
abc 123 678
def 456 901
ghi 789 234
jkl 012 567
mno 345 890
FILE 2:
def 456 901
abc 124 678
mno 345 890
ghi 789 244
OUTPUT FILE:
"from file 1"
abc 123 678
ghi 789 234
"from file 2"
abc 124 678
ghi 789 244

I hope someone can help me with this. Thanks!

Try this..

awk 'FNR==NR { arr[$1]=$0; next }   # first file: remember each line by its first word
($1 in arr) && ($0!=arr[$1]) { f1[$1]=arr[$1]; f2[$1]=$0 }   # second file: same first word, different line
END { print "from file 1"; for (i in f1) print f1[i]; print "from file 2"; for (i in f2) print f2[i] }' file1 file2 > file3

Assumption: the first word is unique within each file. Please let me know if you also need to handle duplicates, so that I can try that.
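
One thing to keep in mind with the awk approach: "for (i in array)" visits the keys in an unspecified order, so the output lines may not come out in the order of file1. Here is a minimal sketch that keeps file1's order, under the same assumption that the first word is unique within each file. The variable names and the temporary directory are only for illustration; the sample data is the one from the question.

```shell
# Sample data copied from the question.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
abc 123 678
def 456 901
ghi 789 234
jkl 012 567
mno 345 890
EOF
cat > "$tmp/file2" <<'EOF'
def 456 901
abc 124 678
mno 345 890
ghi 789 244
EOF

# Read file2 first (keyed on the first word), then walk file1 in order.
# Assumption: the first word is unique within each file.
result=$(awk '
    FNR == NR { second[$1] = $0; next }      # first input: file2
    ($1 in second) && $0 != second[$1] {     # same first word, different line
        from1[++n] = $0
        from2[n]   = second[$1]
    }
    END {
        print "from file 1"; for (i = 1; i <= n; i++) print from1[i]
        print "from file 2"; for (i = 1; i <= n; i++) print from2[i]
    }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that file2 is passed first on purpose, so that the second pass runs over file1 and the mismatches are collected in file1's order.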

Hi king! The code you gave me doesn't work.
It doesn't print the lines that differ, though I am sure there are mismatched lines in the two files.

Thanks!

What have you tried?

I tried the diff command, but I learned that diff compares line by line.
With cat file1 file2 | sort | uniq -u > file3, it yields:

ABC 123
ABC 321
DEF 412
DEF 124

and when I used it in my script, it yields an odd number of lines.
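
A likely reason for the odd count: uniq -u keeps every line that occurs exactly once in the combined input, so a line whose first word exists in only one file (jkl in the sample from the first post) is printed too, without a partner. Below is a small sketch on that sample data. The filter that keeps only first words seen twice is my own addition, and it assumes the first word is unique within each file.

```shell
# Sample data copied from the first post.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
abc 123 678
def 456 901
ghi 789 234
jkl 012 567
mno 345 890
EOF
cat > "$tmp/file2" <<'EOF'
def 456 901
abc 124 678
mno 345 890
ghi 789 244
EOF

# Lines that occur exactly once in the combined, sorted input.
once=$(cat "$tmp/file1" "$tmp/file2" | LC_ALL=C sort | uniq -u)

# Keep only lines whose first word shows up twice, i.e. real mismatched pairs.
paired=$(printf '%s\n' "$once" | awk '
    { line[NR] = $0; key[NR] = $1; count[$1]++ }
    END { for (i = 1; i <= NR; i++) if (count[key[i]] == 2) print line[i] }
')

printf 'uniq -u:\n%s\n\npaired only:\n%s\n' "$once" "$paired"
rm -rf "$tmp"
```

On this data, uniq -u returns five lines (the two abc lines, the two ghi lines and the lone jkl line), and the filter brings it back to four.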

Is this what you want...

sort file1 > file1_tmp
sort file2 > file2_tmp
sdiff file1_tmp file2_tmp | grep '|'

output:

abc 123 678                                                     |  abc 124 678
ghi 789 234                                                     |  ghi 789 244

output should be:

"FROM FILE 1"
ABC 123
ABC 321

"FROM FILE2"
DEF 412
DEF 124

Why don't you give it a try... I have given you the output; you just have to format it.

I got it! I used the 'sort'. Thanks thanks!

Great...

Did you check the content of file3? I redirected the output to file3.
I tested the code with the inputs you gave, and I could match your output too.

If you want the result printed on the screen, remove the "> file3" part from the code and run it.

@ king kalyan: it worked! But I have to sort first. Thanks for the help.

This code works when a difference is found, but what about records that are missing? Those are also differences, right?
For example:

File1

abc 123 678
abc 112 111
xyz 100 000

File2

abc 123 678
abc 112 112

Output of above code will be:

abc 112 111 | abc 112 112

But the output should be:

abc 112 111 | abc 112 112
xyz 100 000 |

Please help with that code. xyz 100 000 is also a difference, since it is missing from the 2nd file.

Hmm...
That was limited to my understanding of king's problem.
You can very well try this... no sorting, no formatting... :)

$ grep -v -f file1 file2
abc 112 112
$ grep -v -f file2 file1
abc 112 111
xyz 100 000
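
One caveat, as far as I know: with plain -f, grep treats every line of the pattern file as a regular expression and matches it anywhere in a line, so a short line such as "abc 1" would also knock out "abc 12 34". Adding -F (fixed strings) and -x (whole-line match) avoids that. A sketch on the File1/File2 example above (the temporary directory and variable names are just for the demo):

```shell
# Sample data copied from the File1/File2 example in this thread.
tmp=$(mktemp -d)
printf '%s\n' 'abc 123 678' 'abc 112 111' 'xyz 100 000' > "$tmp/file1"
printf '%s\n' 'abc 123 678' 'abc 112 112'               > "$tmp/file2"

# -F: patterns are fixed strings, -x: a pattern must match the whole line,
# -v: print the lines that do NOT match any pattern.
only_in_2=$(grep -Fxv -f "$tmp/file1" "$tmp/file2")
only_in_1=$(grep -Fxv -f "$tmp/file2" "$tmp/file1")

printf 'only in file2:\n%s\n\nonly in file1:\n%s\n' "$only_in_2" "$only_in_1"
rm -rf "$tmp"
```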

@rakesh,

Nope, I was looking for the same format as before. The sdiff code worked fine except for records that are missing.
And it's not file1 against file2 or vice versa; both files have to be compared with each other. See the example below:
File1

abc 1 1 1
abc 2 2 2
abc 3 3 3
abc 5 5 5

File2

abc 1 1 1
abc 2 1 2
abc 3 3 3
abc 4 4 4

Output:

abc 2 2 2 | abc 2 1 2
| abc 4 4 4
abc 5 5 5 |
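
A possible awk sketch for this, not a definitive solution. It assumes that the first two fields together identify a record (that is how the lines pair up in the example above) and that this pair is unique within each file. The names left, right and key are made up for the illustration.

```shell
# Sample data copied from the example above.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
abc 1 1 1
abc 2 2 2
abc 3 3 3
abc 5 5 5
EOF
cat > "$tmp/file2" <<'EOF'
abc 1 1 1
abc 2 1 2
abc 3 3 3
abc 4 4 4
EOF

# Each output line is prefixed with its key and a tab so that sort can
# order the result; cut then removes the prefix again.
result=$(awk '
    { key = $1 " " $2 }                  # assumed key: first two fields
    FNR == NR { left[key] = $0; next }   # first input: file1
    { right[key] = $0 }                  # second input: file2
    END {
        for (k in left) {
            if (!(k in right))            { print k "\t" left[k] " |" }
            else if (left[k] != right[k]) { print k "\t" left[k] " | " right[k] }
        }
        for (k in right) {
            if (!(k in left)) { print k "\t| " right[k] }
        }
    }
' "$tmp/file1" "$tmp/file2" | LC_ALL=C sort | cut -f2-)

printf '%s\n' "$result"
rm -rf "$tmp"
```

If a single word is enough to identify a record in your real data, change the key line to key = $1.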

Not sure...
Maybe compareIt can do it... I have never worked with that.

Hi all!

I just want to ask for help regarding this. The requirements are the same as in the original problem. The only difference is that the comparison is only on the 2nd and 3rd columns.

thanks!
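
A rough sketch for that variant, assuming again that the first word is unique within each file. Only columns 2 and 3 are compared, as strings, so a difference in any later column is ignored. The 4th column in the sample data is invented just to show this, and the variable names are only for illustration.

```shell
# Hypothetical sample data: same shape as the original question,
# plus an invented 4th column that must not influence the result.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
abc 123 678 x
def 456 901 y
ghi 789 234 z
EOF
cat > "$tmp/file2" <<'EOF'
def 456 901 q
abc 124 678 x
ghi 789 234 w
EOF

result=$(awk '
    # First input (file2): remember columns 2-3 and the whole line per first word.
    FNR == NR { cols[$1] = $2 " " $3; whole[$1] = $0; next }
    # Second input (file1): report only when columns 2-3 differ.
    ($1 in cols) && ($2 " " $3) != cols[$1] {
        from1[++n] = $0
        from2[n]   = whole[$1]
    }
    END {
        print "from file 1"; for (i = 1; i <= n; i++) print from1[i]
        print "from file 2"; for (i = 1; i <= n; i++) print from2[i]
    }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Joining the two columns into one string before comparing forces a string comparison, so values such as 012 and 12 are treated as different.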