Square matrix to columns

Hello all,

I am quite new to this, but I need some help to keep going with my analysis.

I am struggling with a short script that reads a square matrix and converts it into two columns.

      A     B     C     D
A  0.00  0.06  0.51  0.03
B  0.06  0.00  0.72  0.48
C  0.51  0.72  0.00  0.01
D  0.03  0.48  0.01  0.00

This matrix is an example of the genetic distances of the same gene in different species.

I then need two columns, so that I can easily look up any row and see the genetic distance between each pair of species:

AA  0.00 
AB  0.06
AC  0.51
AD  0.03
BB  0.00
BC  0.72
BD  0.48
CC  0.00
CD  0.01
DD  0.00

Looks easy, but I didn't get it.

Thanks a lot,

EvaAM

Try this:

cat matrix.txt
A B C D
A 0.00 0.06 0.51 0.03
B 0.06 0.00 0.72 0.48
C 0.51 0.72 0.00 0.01
D 0.03 0.48 0.01 0.00


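# Hard-coded for four columns; relies on the labels sorting alphabetically.
# Pairs come out grouped by their second label (AA, AB, BB, AC, ...).
awk 'NR == 1 { c1 = $1; c2 = $2; c3 = $3; c4 = $4; next }
{
    if (c1 <= $1) printf "%s%s %s\n", c1, $1, $2
    if (c2 <= $1) printf "%s%s %s\n", c2, $1, $3
    if (c3 <= $1) printf "%s%s %s\n", c3, $1, $4
    if (c4 <= $1) printf "%s%s %s\n", c4, $1, $5
}' matrix.txt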

awk '
        NR == 1 {
                        split ( $0, H )
        }
        NR > 1 {
                        for ( i = 2; i <= NF; i++ )
                                print $1 H[i-1] OFS $i
        }
' matrix.txt
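
#!/usr/bin/perl
use strict;
use warnings;

# Labels are hard-coded; "file" holds only the numbers (no header row, no row labels).
my @arr = qw/A B C D/;
my @fields;

open my $fh, '<', 'file' or die "Cannot open file: $!";
while (<$fh>) {
    push @fields, [split ' '];    # split ' ' also ignores leading whitespace
}
close $fh;

# Walk the upper triangle, diagonal included.
for my $i (0 .. $#arr) {
    for my $j ($i .. $#arr) {
        print "$arr[$i]$arr[$j] $fields[$i][$j]\n";
    }
}
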
[user@host ~]# cat file
0.00  0.06  0.51   0.03
0.06  0.00  0.72   0.48
0.51  0.72  0.00   0.01
0.03  0.48  0.01   0.00
[user@host ~]#
[user@host ~]# ./test.pl
AA 0.00
AB 0.06
AC 0.51
AD 0.03
BB 0.00
BC 0.72
BD 0.48
CC 0.00
CD 0.01
DD 0.00
[user@host ~]#

I misread the requirement! A slight modification is needed to produce the output that the OP wants:

awk '
        BEGIN {
                        c = 2
        }
        NR == 1 {
                        split ( $0, H )
        }
        NR > 1 {
                        for ( i = c; i <= NF; i++ )
                        {
                                print $1 H[i-1] OFS $i
                        }
                        ++c
        }
' matrix.txt

Try:

awk 'NR==1{split($0,C); next} {for(i=NR; i<=NF; i++) print $1 C[i-1], $i}' file
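
For anyone reading later, here is roughly the same one-liner spelled out with comments. Treat it as a sketch: the function name upper_pairs is made up for illustration, and it assumes a single header row, row labels in the first column, and rows listed in the same order as the columns.

```shell
#!/bin/sh
# upper_pairs: hypothetical helper name. Reads a labelled square matrix on
# stdin and prints the upper triangle, diagonal included, as "pair value".
upper_pairs() {
    awk '
        # Header row: store the column labels, col[1]="A", col[2]="B", ...
        NR == 1 { split($0, col); next }
        {
            # Field 1 is the row label, so on input line NR the diagonal
            # value sits in field NR. Starting there skips the lower triangle.
            for (i = NR; i <= NF; i++)
                print $1 col[i - 1], $i
        }
    '
}

# Demo with the matrix from the question.
upper_pairs <<'EOF'
      A     B     C     D
A  0.00  0.06  0.51  0.03
B  0.06  0.00  0.72  0.48
C  0.51  0.72  0.00  0.01
D  0.03  0.48  0.01  0.00
EOF
```

Run on the matrix from the first post, this should print the ten lines asked for, AA 0.00 through DD 0.00. Nothing is hard-coded to four columns, so a larger matrix with the same layout should work too.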

Nice!

Variable NR never crossed my mind!!

Thanks!

This worked perfectly:

awk 'NR==1{split($0,C); next} {for(i=NR; i<=NF; i++) print $1 C[i-1], $i}' file
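
Since the goal was to look up the distance for any pair easily, here is a small sketch for querying the two-column result. The name pair_distance is only an illustration, and it assumes the converted output has been saved to a file. Only the upper triangle is stored, so BC is found but CB is not.

```shell
#!/bin/sh
# pair_distance: hypothetical helper name. Prints the distance stored for one
# pair label in a two-column "pair value" file; exits non-zero if it is absent.
pair_distance() {
    awk -v pair="$1" '
        $1 == pair { print $2; found = 1; exit }   # first match wins
        END        { exit !found }                 # status 1 when not found
    ' "$2"
}

# Demo: a temporary file stands in for the saved output of the conversion.
pairs=$(mktemp)
cat > "$pairs" <<'EOF'
AA 0.00
AB 0.06
AC 0.51
AD 0.03
BB 0.00
BC 0.72
BD 0.48
CC 0.00
CD 0.01
DD 0.00
EOF

pair_distance BC "$pairs"    # prints 0.72
rm -f "$pairs"
```

With longer species names it would probably be safer to put a separator between the two labels when building the pairs, so that a pair label cannot be read two ways.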
