Fixed-width file comparison not working

onesuri · February 8, 2013, 2:17am

When I use the diff, cmp, and cat -v commands, they report differences.

cmp command:
cmp -l file1 file2 | head -1

1300 15 10

I manually checked the records (vi +1300 filename). Record length and data matched.

diff file1 file2

3c3
< record information

I manually checked the records (vi +3 filename). Record length and data matched.

I also checked with cat -v to find any special characters.
Example:
cat -v "some value" file1   <-- No special characters
cat -v "some value" file2   <-- No special characters

Is there any way to compare fixed-width files?
Thanks,
oneSuri
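
A note on reading the cmp output in the question: cmp -l prints the byte offset (counted from 1) of each differing byte, followed by the two byte values in octal. So 1300 is a byte position rather than a line number, while vi +1300 jumps to line 1300, and octal 15 is a carriage return. The following is a minimal sketch, using made-up sample files and a hypothetical helper named byte_to_line, of how a byte offset could be mapped back to the line that contains it. It assumes a head that accepts -c (GNU and BSD versions do).

```shell
#!/usr/bin/env bash
# Sketch only: the sample files and the helper name byte_to_line are made up.

# byte_to_line FILE OFFSET
# Print the number of the line that contains the 1-based byte OFFSET.
byte_to_line() {
  local file=$1 offset=$2 newlines
  # Count the newlines that occur before the byte in question.
  newlines=$(head -c "$((offset - 1))" -- "$file" | wc -l)
  echo "$((newlines + 1))"
}

tmp=$(mktemp -d)
# Three 10-byte records each; file2 has a carriage return inside record 2.
printf 'AAAAAAAAA\nBBBBBBBBB\nCCCCCCCCC\n'  > "$tmp/file1"
printf 'AAAAAAAAA\nBBBBBBBB\r\nCCCCCCCCC\n' > "$tmp/file2"

# First column of "cmp -l" is the byte offset of the first difference.
offset=$(cmp -l "$tmp/file1" "$tmp/file2" | head -n 1 | awk '{print $1}')
echo "first difference at byte $offset, line $(byte_to_line "$tmp/file2" "$offset")"
# prints: first difference at byte 19, line 2
```

Opening the reported line (here, vi +2 file2) rather than the byte offset shows the record that actually differs.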

RudiC · February 8, 2013, 4:40am

Please post the relevant parts of your files, e.g. some lines around line 3 and some around line 1300. Did you check for the literal string "record information" that diff printed for line 3?

user8 · February 8, 2013, 4:42am

You could try: md5sum
PS: I always use cat -A, rather than cat -v
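
Both suggestions can be tried together. A minimal sketch, assuming GNU coreutils (md5sum and the -A option of cat are not available on every system) and made-up sample files: compare checksums first, and if they differ, make non-printing characters visible. The function names same_content and show_invisible exist only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch only: sample files and function names are made up.

# same_content FILE1 FILE2: succeed if both files have the same MD5 checksum.
same_content() {
  [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]
}

# show_invisible [FILE...]: GNU "cat -A" shows CR as ^M, tab as ^I,
# and marks every line end with $. Reads stdin when no file is given.
show_invisible() {
  cat -A -- "$@"
}

tmp=$(mktemp -d)
printf 'ABC  123\n'   > "$tmp/unix.txt"
printf 'ABC  123\r\n' > "$tmp/dos.txt"

if ! same_content "$tmp/unix.txt" "$tmp/dos.txt"; then
  show_invisible "$tmp/unix.txt"   # ABC  123$
  show_invisible "$tmp/dos.txt"    # ABC  123^M$
fi
```

Two records that look identical in an editor can still differ in a trailing carriage return or trailing blanks, and the $ end-of-line marker makes both visible.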

onesuri · February 8, 2013, 5:41am

drl · February 11, 2013, 9:52am

Hi.

This solution relies on two components, docdiff and a short perl script:

#!/usr/bin/env bash

# @(#) s2	Demonstrate differences at character level.

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C perl docdiff 

f1=data1
f2=data2
FILES="$f1 $f2"

pl " Input files $FILES"
head $FILES

pl " perl extraction helper script:"
cat p1

pl " Results, wdiff format, $f1, $f2:"
docdiff --wdiff --char $f1 $f2

pl " Results, wdiff format, $f1, $f2, extracted diff with labels:"
docdiff --wdiff --char $f1 $f2 |
./p1 $f1 $f2

pl " Results, wdiff format, $f2, $f1, extracted diff with labels:"
docdiff --wdiff --char $f2 $f1 |
./p1 $f2 $f1

exit 0

producing:

% ./s2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
perl 5.10.0
docdiff 0.3.4

-----
 Input files data1 data2
==> data1 <==
orange
123456789xa
X-klystron

==> data2 <==
orange
123456780xb
Y-klystron

-----
 perl extraction helper script:
#!/usr/bin/env perl

# @(#) p1	Demonstrate wdiff difference format extraction with labels.

$f1 = shift || die " Missing first label.\n";
$f2 = shift || die " Missing second label.\n";

while (<>) {
  # Collect deleted text, marked [-...-] in wdiff output.
  @a = m/\[-(.*?)-\]/xmsg;
  print "$f1: ", join( "", @a ), "\n" if @a;
  # Collect inserted text, marked {+...+} in wdiff output.
  @b = m/\{\+(.*?)\+\}/xmsg;
  print "$f2: ", join( "", @b ), "\n" if @b;
}

exit(0);

-----
 Results, wdiff format, data1, data2:
orange
12345678[-9-]{+0+}x[-a-]{+b+}
[-X-]{+Y+}-klystron

-----
 Results, wdiff format, data1, data2, extracted diff with labels:
data1: 9a
data2: 0b
data1: X
data2: Y

-----
 Results, wdiff format, data2, data1, extracted diff with labels:
data2: 0b
data1: 9a
data2: Y
data1: X

The idea is that docdiff can report differences at a resolution down to individual characters. Its wdiff-style output is then processed by the perl script, which extracts the deleted and inserted characters and labels them with the file names. The data files were augmented to make sure that multiple lines, as well as identical lines, would be handled correctly.

The docdiff utility is written in ruby, is available in Debian-based GNU/Linux repositories, and can also be found on SourceForge.net ("DocDiff: Compare text word by word").

See man pages for details.

Best wishes ... cheers, drl (125)

---------- Post updated at 08:52 ---------- Previous update was at 08:10 ----------

Hi.

An all-perl solution:

#!/usr/bin/env perl

# @(#) p2	Demonstrate character differences in same-length lines.

use warnings;
use strict;

my (
  $f1, $f2, $file1, $file2, $i,       @a, @b,
  $s1, $s2, $t1,    $t2,    $changed, $debug
);

$f1 = shift || die " Missing first file.\n";
$f2 = shift || die " Missing second file.\n";

$debug = 1;
$debug = 0;

open( $file1, "<", $f1 ) || die " Cannot open file $f1\n";
open( $file2, "<", $f2 ) || die " Cannot open file $f2\n";
while ( $t1 = <$file1> ) {
  chomp($t1);
  @a = split "", $t1;
  $t2 = <$file2>;
  last if not defined $t2;    # file2 has fewer lines than file1; stop comparing
  chomp($t2);
  @b = split "", $t2;
  print "file1,2 = ", join "", @a, " ", join "", @b, "\n" if $debug;
  $changed = 0;
  $s1 = $s2 = "";

  for ( $i = 0; $i <= $#a; $i++ ) {
    if ( $a[$i] ne $b[$i] ) {
      $s1 = "$f1: " if not $changed;
      $s2 = "$f2: " if not $changed;
      $s1 .= $a[$i];
      $s2 .= $b[$i];
      $changed++;
    }
  }
  print "$s1\n" if $changed;
  print "$s2\n" if $changed;
}

exit(0);

producing, using the data files noted above:

% ./p2 data1 data2
data1: 9a
data2: 0b
data1: X
data2: Y

Best wishes ... cheers, drl
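
For fixed-width records it is often useful to know which column differs, not only which characters. A minimal bash sketch of that idea, separate from the scripts above: the function name col_diff is made up, it assumes both files have the same number of newline-terminated lines, and it compares character by character in the current locale. The sample data is the data1/data2 pair shown earlier.

```shell
#!/usr/bin/env bash
# Sketch only: col_diff is a made-up helper, not part of the scripts above.

# col_diff FILE1 FILE2
# For each pair of lines, report the 1-based column of every differing character.
col_diff() {
  local l1 l2 n=0 i max
  # Read the two files in lockstep on file descriptors 3 and 4.
  while IFS= read -r l1 <&3 && IFS= read -r l2 <&4; do
    n=$((n + 1))
    [ "$l1" = "$l2" ] && continue
    # Walk up to the longer length so a short record is reported too.
    max=${#l1}
    [ "${#l2}" -gt "$max" ] && max=${#l2}
    for ((i = 0; i < max; i++)); do
      if [ "${l1:i:1}" != "${l2:i:1}" ]; then
        # %q prints control characters in a visible, shell-quoted form.
        printf 'line %d col %d: %q vs %q\n' "$n" "$((i + 1))" "${l1:i:1}" "${l2:i:1}"
      fi
    done
  done 3<"$1" 4<"$2"
}

tmp=$(mktemp -d)
printf 'orange\n123456789xa\nX-klystron\n' > "$tmp/data1"
printf 'orange\n123456780xb\nY-klystron\n' > "$tmp/data2"

col_diff "$tmp/data1" "$tmp/data2"
# line 2 col 9: 9 vs 0
# line 2 col 11: a vs b
# line 3 col 1: X vs Y
```

Because %q quotes its argument, a stray carriage return would show up as $'\r' and a missing character as '' instead of being invisible on the terminal.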