Subtracting columns against each other

Fredrick · February 25, 2010, 11:18am

Hi All,

I have a file of 100 lines, each having 1000 columns. I need to find the difference of each column against every other column. That means Col1-Col1; Col1-Col2; Col1-Col3; ... Col1-Col1000; Col2-Col1; Col2-Col2; Col2-Col3; ... and so on, up to Col1000-Col1000.

Let's say the file has 5 lines, each having 5 columns. The input file is as follows:

            Col1    Col2    Col3   Col4   Col5
Line1       A        B         C      D       E
Line2       A        B         C      D       E
Line3       A        B         C      D       E
Line4       A        B         C      D       E
Line5       A        B         C      D       E

The output I am expecting is as follows:

            Col1  Col2   Col3  Col4   Col5
Line1       0     A-B    A-C   A-D    A-E
Line2       0     A-B    A-C   A-D    A-E
Line3       0     A-B    A-C   A-D    A-E
Line4       0     A-B    A-C   A-D    A-E
Line5       0     A-B    A-C   A-D    A-E

Line6     B-A       0    B-C   B-D    B-E
Line7     B-A       0    B-C   B-D    B-E
Line8     B-A       0    B-C   B-D    B-E
Line9     B-A       0    B-C   B-D    B-E
Line10    B-A       0    B-C   B-D    B-E

.
.
.
.
.
.
.
.
Line25    E-A     E-B    E-C   E-D      0

For this I have used the following code:

awk '{for(i=1; i<NF; i++) {for(j=1; j<NF; j++) {s=s FS $j-$i} print s;s=""}}{print "\n"}' infile > outfile

But I am not getting the result I expected. I got the following result:

           Col1    Col2     Col3  Col4   Col5
Line1        0     A-B      A-C   A-D    A-E
Line2      B-A       0      B-C   B-D    B-E
Line3      C-A     C-B        0   C-D    C-E
Line4      D-A     D-B      D-C     0    D-E
Line5      E-A     E-B      E-C   E-D      0

Line6        0     A-B      A-C   A-D    A-E
Line7      B-A       0      B-C   B-D    B-E
.
.
.
.
.
.
up to
.
.
Line25     E-A     E-B      E-C   E-D      0

But this is not the expected form, so I need help sorting out this problem.

Can anyone help me in this regard? Expecting your reply and thanks in advance.

Warm regards
Fredrick.

---------- Post updated at 04:58 PM ---------- Previous update was at 04:42 PM ----------

The above problem in a simplified way:

Inputfile:

1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1

Code:

awk '{for(i=1; i<NF; i++) {for(j=1; j<NF; j++) {s=s FS $j-$i} print s;s=""}}{print "\n"}' infile > outfile

Got the following output:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0

0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0

Can anyone help me in this regard?

Thanks in advance.

Fredrick.

---------- Post updated at 05:18 PM ---------- Previous update was at 04:58 PM ----------

Since the above example input file has the same value in every column, it is not a good example.

I used the following as one more example:

Input file:
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5

Code:

awk '{for(i=1; i<NF; i++) {for(j=1; j<NF; j++) {s=s FS $j-$i} print s;s=""}}{print "\n"}' infile > outfile

Got the following as output:
0 1 2 3
-1 0 1 2
-2 -1 0 1
-3 -2 -1 0

0 1 2 3
-1 0 1 2
-2 -1 0 1
-3 -2 -1 0

Can anyone help me in this regard?

Warm regards
Fredrick.

pludi · February 25, 2010, 11:48am

You do know that with your requirements you'll get n^2 columns of output for any n columns of input (5 columns input -> 25 columns output, 1,000 input -> 1,000,000 output)?

That aside, see if this fits your needs:

#!/usr/bin/perl -W

use strict;
use warnings;

while ( my $line = <DATA> ) {
    my @cols = split / /, $line;
    my ( $l, $r ) = ( 0, 0 );
    for ( $l = 0 ; $l <= $#cols ; $l++ ) {
        for ( $r = 0 ; $r <= $#cols ; $r++ ) {
            print $cols[$l] - $cols[$r], " ";
        }
    }
    print "\n";
}

__DATA__
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5

Example output:

$ perl cols.pl
0 -1 -2 -3 -4 1 0 -1 -2 -3 2 1 0 -1 -2 3 2 1 0 -1 4 3 2 1 0 
0 -1 -2 -3 -4 1 0 -1 -2 -3 2 1 0 -1 -2 3 2 1 0 -1 4 3 2 1 0 
0 -1 -2 -3 -4 1 0 -1 -2 -3 2 1 0 -1 -2 3 2 1 0 -1 4 3 2 1 0 
0 -1 -2 -3 -4 1 0 -1 -2 -3 2 1 0 -1 -2 3 2 1 0 -1 4 3 2 1 0 
0 -1 -2 -3 -4 1 0 -1 -2 -3 2 1 0 -1 -2 3 2 1 0 -1 4 3 2 1 0
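
For comparison, the same one-line-per-input-line layout can be sketched in awk, the tool used in the original question. This is only a sketch under a few assumptions: the sample file is recreated here for illustration, the names infile and outfile are borrowed from the question, and the variables out and sep are made up for this example. Note the `<=` in both loop conditions; the original attempt used `i<NF` and `j<NF`, which is why its output had only four rows and four columns per block.

```shell
# Recreate the 5x5 sample input from the thread (illustration only).
for n in 1 2 3 4 5; do echo "1 2 3 4 5"; done > infile

# For every input line, print all pairwise differences $i - $j on one line.
awk '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++)          # <= so the last column is included
        for (j = 1; j <= NF; j++) {
            out = out sep ($i - $j)    # parentheses keep the subtraction together
            sep = " "
        }
    print out
}' infile > outfile
```

With the sample input, each line of outfile should carry 25 values, in the same order as the Perl output above.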

Fredrick · March 1, 2010, 11:29am

Hi Pludi,

Thank you very much for your reply. I have tried the following code:

#!/usr/bin/perl -W

use strict;
use warnings;

while ( my $line = <DATA> ) {
    my @cols = split / /, $line;
    my ( $l, $r ) = ( 0, 0 );
    for ( $l = 0 ; $l <= $#cols ; $l++ ) {
        for ( $r = 0 ; $r <= $#cols ; $r++ ) {
            print $cols[$l] - $cols[$r], " ";
        }
    }
    print "\n";
}

While executing the file, I am getting the following error message:

Name "main::DATA" used only once: possible typo at ./example.pl line 6.
readline() on unopened filehandle DATA at ./example.pl line 6.

Can you tell me where the mistake is? As per the error message, DATA has been used only once, in line 6.

Expecting your reply and thanks in advance.

Warm regards
Fredrick.

ahmad.diab · March 1, 2010, 12:05pm

#!/usr/bin/perl -W

use strict;
use warnings;
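# Open the real input file; stop with a clear message if that fails.
open(DATA, '<', 'infile.txt') or die "Cannot open infile.txt: $!";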
while ( my $line = <DATA> ) {
    my @cols = split / /, $line;
    my ( $l, $r ) = ( 0, 0 );
    for ( $l = 0 ; $l <= $#cols ; $l++ ) {
        for ( $r = 0 ; $r <= $#cols ; $r++ ) {
            print $cols[$l] - $cols[$r], " ";
        }
    }
    print "\n";
}

pludi · March 1, 2010, 3:36pm

fredrick:

While executing the file, I am getting the following error message:
Name "main::DATA" used only once: possible typo at ./example.pl line 6.
readline() on unopened filehandle DATA at ./example.pl line 6.
Can you tell me where the mistake is? As per the error message, DATA has been used only once, in line 6.

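I've been using the DATA pseudo-file (the whole section after the __DATA__ line in my original code) in lieu of a real file. Since you left that section out, the DATA handle was never opened. You'll have to open() a real file yourself, or use <> instead of <DATA> if you're reading from stdin anyway (<> reads the files named on the command line, or stdin if there are none).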

Fredrick · March 2, 2010, 7:36am

Thank you ahmad.diab and pludi, it's working fine.

warm regards
Fredrick.

---------- Post updated at 01:36 PM ---------- Previous update was at 09:57 AM ----------

Hi All,

To continue with the same problem, I would like to get the output in the following manner:

0 -1 -2 -3 -4 
0 -1 -2 -3 -4
0 -1 -2 -3 -4
0 -1 -2 -3 -4
0 -1 -2 -3 -4
1 0 -1 -2 -3  
1 0 -1 -2 -3  
1 0 -1 -2 -3  
1 0 -1 -2 -3 
1 0 -1 -2 -3 
2 1 0 -1 -2 
2 1 0 -1 -2 
2 1 0 -1 -2 
2 1 0 -1 -2 
2 1 0 -1 -2 
3 2 1 0 -1 
3 2 1 0 -1 
3 2 1 0 -1 
3 2 1 0 -1 
3 2 1 0 -1 
4 3 2 1 0 
4 3 2 1 0 
4 3 2 1 0 
4 3 2 1 0 
4 3 2 1 0

Can anyone help me in this regard?

Expecting your reply and thanks in advance.

Warm regards
Fredrick.

pludi · March 2, 2010, 7:45am

I have to say, this starts to sound more and more like homework. That, or you're just plain lazy and expect us to do your work.
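
For reference, the block-per-column layout requested above (all input lines for Col1 minus every column, then all lines for Col2, and so on) could be sketched in awk roughly as follows. This is not a solution anyone posted in the thread. It assumes the whole file fits in memory, which should hold for 100 lines of 1000 columns, and the names cell, nf, out and sep are invented for this example.

```shell
# Recreate the 5x5 sample input from the thread (illustration only).
for n in 1 2 3 4 5; do echo "1 2 3 4 5"; done > infile

awk '
{
    # Store every field of every line; nf tracks the widest line seen.
    for (j = 1; j <= NF; j++) cell[NR, j] = $j
    if (NF > nf) nf = NF
}
END {
    # One block per reference column i: each input line r, as column i minus column j.
    for (i = 1; i <= nf; i++)
        for (r = 1; r <= NR; r++) {
            out = ""; sep = ""
            for (j = 1; j <= nf; j++) {
                out = out sep (cell[r, i] - cell[r, j])
                sep = " "
            }
            print out
        }
}' infile > outfile
```

The per-line loop in the earlier attempts cannot produce this ordering on its own, because the rows for a given reference column have to be grouped across all input lines. That is why the sketch buffers the input and prints everything in the END block. Reading the file once per column would avoid the buffering at the cost of many passes.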