Show the Difference between two files

I have two files and I need to know the difference between each line. This will extend to thousands of lines, so manual work is not really an option.
Sample:

First File                         Second File
allan entry1 entry2 entry3         allan entry1 entry3
bob entry1 entry2 entry3 entry4    bob entry1 entry4

I want to output the difference only. Sample output:

allan entry2
bob entry2 entry3

I know that a simple grep will not work here. UNIX is new to me, and this is the only way I think this can be done.

Hi,

First, I am not very clear about the way you explained the content of the two files and the sample output. Very sorry about that.

But you can do file comparison using the "comm", "diff", or "cmp" commands...

Thanks
Dennis

Sorry about that. These are the files.

First File
allan entry1 entry2 entry3
bob entry1 entry2 entry3 entry4

Second File
allan entry1 entry3
bob entry1 entry4

Output:
allan entry2
bob entry2 entry3

Using diff and comm doesn't give me the result that I want. The result should only be the name and the entries that can't be found in the second file. In my example, allan will have entry2, since it is not in the second file.

Are the two files guaranteed to have the same number of lines? So if a bob line is in one file it is also in the other? Are the records in the same sequence in both files?

If you have Python and know the language, here's an alternative:

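#!/usr/bin/env python3
# Map each name in file2 to the set of entries listed for it there.
kept = {}
with open("file2") as second:
    for line in second:
        fields = line.split()
        if fields:  # skip blank lines
            kept[fields[0]] = set(fields[1:])

# For each file1 line, print the name and the entries file2 lacks.
with open("file1") as first:
    for line in first:
        fields = line.split()
        if fields:
            found = kept.get(fields[0], set())
            print("Output:", fields[0], *[e for e in fields[1:] if e not in found])

output:

/test # ./test.py
Output: allan entry2
Output: bob entry2 entry3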

Yes, they have the same number of lines, and the lines are in the same sequence.

I don't have Python, so I can't use the code given by ghostdog74.

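awk ' BEGIN { while ( ( getline < "first_file" ) > 0 ) { arr[$1]=$0; } }   # load first_file, keyed by name
{ for( i = 2 ; i <= NF ; ++i )
	sub(" " $i "( |$)"," ",arr[$1])   # drop whole fields only; $i is still read as a regex
  gsub("  +"," ",arr[$1])             # collapse any runs of spaces
  sub(" +$","",arr[$1])               # trim the space left by a removed last field
  print arr[$1]
} ' second_file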
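
For readers who do have Python, here is a rough sketch of a different route. Since the two files were confirmed to have the same number of lines in the same order, the lines can be paired by position instead of being looked up by name. The function name missing_entries and the in-memory sample lists are my own, for illustration only, and the sketch assumes there are no blank lines.

```python
def missing_entries(first_lines, second_lines):
    """Pair lines by position; return each name plus the entries absent from the second file."""
    results = []
    for first, second in zip(first_lines, second_lines):
        name, *entries = first.split()
        other_name, *kept = second.split()
        if name != other_name:
            # The same-order assumption does not hold; stop rather than mis-pair lines.
            raise ValueError(f"line order differs: {name!r} vs {other_name!r}")
        kept = set(kept)
        # Whole-field comparison, so "entry1" never matches inside "entry10".
        missing = [e for e in entries if e not in kept]
        results.append(" ".join([name, *missing]))
    return results


first = ["allan entry1 entry2 entry3", "bob entry1 entry2 entry3 entry4"]
second = ["allan entry1 entry3", "bob entry1 entry4"]
for line in missing_entries(first, second):
    print(line)
```

With the sample data from this thread it prints "allan entry2" and "bob entry2 entry3". If the order guarantee ever stops holding, it raises an error instead of silently comparing the wrong lines.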

It's working! Thanks anbu23! :)

Use the diff command.
Check the details in the manual by typing 'man diff'.

Usage:

diff file1 file2

This will give you the lines at which the two files differ.
If the files extend to thousands of lines, you can pipe the output through more, like this:

diff file1 file2 | more

This will stop at the end of each screen; press the space bar to go to the next page, or the Enter key to advance one line.
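
One caveat: a line-oriented comparison reports a changed line as a whole, so on its own it will not single out the entries that went missing. A small sketch with Python's standard difflib module (used here only as a stand-in for diff, with the sample data from this thread held in lists) shows the effect:

```python
import difflib

first = ["allan entry1 entry2 entry3", "bob entry1 entry2 entry3 entry4"]
second = ["allan entry1 entry3", "bob entry1 entry4"]

# unified_diff compares whole lines, much like "diff -u"; lineterm="" suits
# input lines that carry no trailing newline.
report = list(difflib.unified_diff(first, second, "file1", "file2", lineterm=""))
for line in report:
    print(line)
```

Every line of both files shows up as removed or added, because each one differs somewhere; picking out entry2 and entry3 still needs a field-by-field comparison.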

Enjoy UNIX!

In Perl:

#! /opt/third-party/bin/perl

open(FILE, "<", "first") || die "Unable to open first. <$!>\n";

while(<FILE>) {
  chomp;
  @split_arr = split(/ /, $_);
  my $dump;
  # Join fields 1 through the last index (everything after the name).
  for( my $i = 1; $i <= $#split_arr; $i++ ) {
    $dump .= ($split_arr[$i] . " ");
  }
  $fileHash{$split_arr[0]} = $dump;
}

close(FILE);

open(FILE, "<", "second") || die "Unable to open second. <$!>\n";

while(<FILE>) {
  chomp;
  @split_arr = split(/ /, $_);
  if ( exists $fileHash{$split_arr[0]} ) {
    @new_arr = split(/ /, $fileHash{$split_arr[0]});

    print "$split_arr[0] ";
    for( $i = 0; $i <= $#new_arr; $i++ ) {
      for( $j = 1; $j <= $#split_arr; $j++ ) {
        # Compare whole fields; a pattern match would also hit substrings (entry1 inside entry10).
        if( $new_arr[$i] eq $split_arr[$j] ) {
          last;
        }
      }
      if( $j > $#split_arr ) {
        print "$new_arr[$i] ";
      }
    }
  }
  print "\n";
}

close(FILE);

exit 0

I'm using these lines to find the difference between two files.

To find data that exists in file1 but not in file2:
diff file1 file2 | grep '^<' | sed 's/^< //'

To find data that exists in file2 but not in file1:
diff file1 file2 | grep '^>' | sed 's/^> //'
----
Ismail
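
A similar whole-line comparison can be sketched in Python. This is only an approximation of the diff pipeline: it ignores line order, whereas diff is sensitive to position. The helper name only_in_first and the sample lines (including carol and dave) are made up for illustration.

```python
def only_in_first(first_lines, second_lines):
    """Return the lines of first_lines that occur nowhere in second_lines."""
    present = set(second_lines)
    return [line for line in first_lines if line not in present]


# Made-up sample data, not taken from the thread.
file1 = ["allan entry1", "bob entry1", "carol entry1"]
file2 = ["allan entry1", "carol entry1", "dave entry1"]

print(only_in_first(file1, file2))  # lines only in file1
print(only_in_first(file2, file1))  # lines only in file2
```

Like the diff approach, this works on whole lines, so it answers "which lines differ" rather than "which entries are missing from a line".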