I have this data:
A 1
B 2
C 3
D 4
E 5
I would like to change it to:
A B C D E
1 2 3 4 5
Please suggest how we can do it in UNIX.
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}' infile
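In case anyone wants to try it, here is a minimal run using the sample data from this thread saved as infile. (Note it relies on NF and NR still holding the last record's values inside the END block, which nawk and gawk both do.)

```shell
# Sample data from the thread, saved as infile.
cat > infile <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# Cache every field as _[row, column] while reading, then print the
# array column-by-column in END to produce the transpose.
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}' infile
# A B C D E
# 1 2 3 4 5
```

Since the whole matrix is held in memory, this is fast but can struggle on very large files (see later in the thread).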
I have a file with recurring blocks:
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
I would like to convert it to:
A B C D E
1 2 3 4 5
11 12 23 25 21
awk 'NR==1{print $2,$4,$6,$8,$10}{print $3,$5,$7,$9,$11}' RS= infile
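A note on that one-liner: RS= puts awk in paragraph mode, where records are separated by blank lines and newline also acts as a field separator. So it only works if the Start/End blocks are separated by blank lines, and the field positions ($2, $4, ...) are hard-coded for exactly five A-E pairs. A sketch (the blank line between blocks is required):

```shell
# Blocks must be separated by a blank line for RS= (paragraph mode)
# to split them into separate records.
cat > blocks <<'EOF'
Start
A 1
B 2
C 3
D 4
E 5
End

Start
A 11
B 12
C 23
D 25
E 21
End
EOF

# In each record: $1=Start, $2=A, $3=1, $4=B, $5=2, ..., $12=End.
awk 'NR==1{print $2,$4,$6,$8,$10}{print $3,$5,$7,$9,$11}' RS= blocks
# A B C D E
# 1 2 3 4 5
# 11 12 23 25 21
```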
output:
A B C D E
1 2 3 4 5
11 12 23 25 21
Following radoulov's code:
awk '/Start/ {i++;a=1;next} /End/{a=0;next} {if (a==1) print > "file" i}' infile
join file* | awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}'
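To spell out how the two steps fit together, here is a sketch assuming exactly two Start/End blocks, so join gets exactly two files:

```shell
cat > infile2 <<'EOF'
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
EOF

rm -f file1 file2
# Step 1: write each Start...End block to its own file (file1, file2).
awk '/Start/ {i++;a=1;next} /End/{a=0;next} {if (a==1) print > "file" i}' infile2
# Step 2: join merges the blocks on the key column ("A 1 11", "B 2 12", ...),
# then the transpose awk from earlier in the thread flips rows and columns.
join file1 file2 |
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}'
# A B C D E
# 1 2 3 4 5
# 11 12 23 25 21
```

Keep in mind that join takes exactly two files, so with more than two blocks you would have to join the result with the next file repeatedly.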
Please tell me the procedure to execute this code.
Or you could use Perl -
$ cat f3
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
$ perl -ane 'if (/^Start/) {$in=1}
elsif (/^End/) {push @m, []; $in=0; $h=1}
elsif ($in) {
if (!$h) {push @{$m[0]},$F[0]; push @{$m[1]},$F[1]}
else {push @{$m[$#m]}, $F[1]}
}
END {foreach $i (@m) {print join "\t",@$i,"\n"}}
' f3
A B C D E
1 2 3 4 5
11 12 23 25 21
tyler_durden
Yes,
just define the FS:
awk -F'\t' ...
@Radoulov
Is it possible to apply the code to transpose 30,000 rows and 1,000 columns?
When I ran your script, nothing happened except my system freezing.
What system are you using? Which awk implementation and version?
Could you try the code with a small sample, just to check if it's
really because of the size?
It works fine with a small file, but not with the big file I specified in my recent post. I'm using a MacBook Pro with 4 GB RAM and an i5 processor. A new version of awk, I guess.
Could you please post the output of the following command:
awk --version | head -1
awk version 20070501
That seems to be the AT&T awk. Could you try installing GNU awk?
This version doesn't need arrays or much memory, but uses more CPU and disk I/O.
The 1st column becomes the file name and the 2nd column becomes a line in that file.
#!/bin/ksh
delim=";"
rm -f *.key 2>/dev/null
cat file | while read key value xxx
do
    [ "$value" = "" ] && continue  # fewer than 2 values on this line
    echo "$value" >> "$key.key"
done
maxsize=0  # how many lines we need = number of lines in the longest column file
for col in *.key
do
    size=$(wc -l < "$col")
    ((size>maxsize)) && maxsize=$size
    colname=${col%%.*}  # strip everything from the first dot = name of the col
    print -n "$colname$delim"
done
echo
line=1
while ((line<=maxsize))
do
    for col in *.key
    do
        value=$(sed -n "${line}p" "$col" 2>/dev/null)  # take the Nth line from the file
        print -n "$value$delim"
    done
    echo
    ((line+=1))
done
@rad:
I installed gawk 3.1.8 and there is still no output, and my Mac is screaming.
@kshji
How do I use your code?
My file looks like:
name a1 a2 .....a200
n1 0 2.4..........0.339
.
.
.
n200000 139.9 333339.9........0.989
This code keeps rereading the file (so it is slower), but it does not use a lot of memory:
awk 'NR==1{for(i=2;i<=NF;i++)ARGV[ARGC++]=ARGV[1]}
FNR==1{print x;c++}{printf "%s"FS,$c}END{print x}' infile
So perhaps you could give that a try. I have not tested it on large matrices. There should only be one input file, with no empty lines in it.
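For what it's worth, here is how the trick works: on the first record, ARGV[ARGC++]=ARGV[1] queues the input file NF-1 extra times, so awk re-reads it once per column; each new pass (FNR==1) ends the previous output row and advances $c to the next column. As written above, the first print x also emits a leading blank line, so this sketch uses a small variant of my own ("if (c++)") that prints the row break only from the second pass on:

```shell
cat > infile <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# Re-read the file once per column: O(1) memory, NF passes of I/O.
# Variant of the code above: "if (c++)" suppresses the leading blank line.
awk 'NR==1{for(i=2;i<=NF;i++)ARGV[ARGC++]=ARGV[1]}
FNR==1{if(c++)print x}{printf "%s"FS,$c}END{print x}' infile
# A B C D E
# 1 2 3 4 5
# (each output row ends with a trailing field separator)
```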
My solution was for aravindj80's original data, simplified from the original.
Next is a solution that also works if you have more columns per line. It just splits the columns into key files, one value per line. The solution also includes the data which I used to test.
cat <<EOF > $0.tmp
Start
A 1 K1 k1
B 2
C 3 T2 t4
D 4 C5 c4 cc6 hg55
E 5
End
Start
A 11
B 12
C 23 C5a c4a cc6a hg55a
D 25
E 21
End
Start
A 71
B 12
D 75
E 21
End
EOF
rm -f *.key 2>/dev/null
delim=";"
# The 1st column becomes the name of the file (= the name of the col); the other values become lines in that file.
cat $0.tmp | while read key values
do
    [ "$values" = "" ] && continue
    for value in $values
    do
        echo "$value" >> "$key.key"
    done
done
# now the data is in the files; print it out
maxsize=0  # height of the output table = number of lines in the longest file
for col in *.key
do
    size=$(wc -l < "$col")
    ((size>maxsize)) && maxsize=$size
    colname=${col%%.*}
    print -n "$colname$delim"
done
echo
# now we have the height of the output table
line=1
while ((line<=maxsize))
do
    for col in *.key
    do
        value=$(sed -n "${line}p" "$col")  # take the Nth line from the file
        print -n "$value$delim"
    done
    echo
    ((line+=1))
done
$ <abc tr ' ' '\n' | pr -5 -s' ' -t
$ cat abc
A 1
B 2
C 3
D 4
E 5
$ <abc tr ' ' '\n' | pr -5 -s' ' -t
A B C D E
1 2 3 4 5
$
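To unpack that last one: tr splits the two-column file into one token per line, and pr reassembles the tokens down five columns (-5), so A and 1 land in the first column, B and 2 in the second, and so on; -t drops the headers and trailers, and -s' ' separates the columns with a single space. The column count (-5) has to match the number of rows in the original file:

```shell
cat > abc <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# One token per line, then re-columnize down 5 columns.
tr ' ' '\n' < abc | pr -5 -s' ' -t
# A B C D E
# 1 2 3 4 5
```

Very compact, but unlike the awk solutions it needs the row count up front.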