I have this data:
A 1
B 2
C 3
D 4
E 5
I would like to change it to:
A B C D E
1 2 3 4 5
Please suggest how we can do it in UNIX.
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}' infile
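In case anyone wants to try it, here is a minimal run using the sample data from this thread saved as infile. (Note it relies on NF and NR still holding the last record's values inside the END block, which nawk and gawk both do.)

```shell
# Sample data from the thread, saved as infile.
cat > infile <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# Cache every field as _[row, column] while reading, then print the
# array column-by-column in END to produce the transpose.
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}' infile
# A B C D E
# 1 2 3 4 5
```

Since the whole matrix is held in memory, this is fast but can struggle on very large files (see later in the thread).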
I have a file with recurring blocks:
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
I would like to convert it to:
A B C D E
1 2 3 4 5
11 12 23 25 21
awk 'NR==1{print $2,$4,$6,$8,$10}{print $3,$5,$7,$9,$11}' RS= infile
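A note on that one-liner: RS= puts awk in paragraph mode, where records are separated by blank lines and newline also acts as a field separator. So it only works if the Start/End blocks are separated by blank lines, and the field positions ($2, $4, ...) are hard-coded for exactly five A-E pairs. A sketch (the blank line between blocks is required):

```shell
# Blocks must be separated by a blank line for RS= (paragraph mode)
# to split them into separate records.
cat > blocks <<'EOF'
Start
A 1
B 2
C 3
D 4
E 5
End

Start
A 11
B 12
C 23
D 25
E 21
End
EOF

# In each record: $1=Start, $2=A, $3=1, $4=B, $5=2, ..., $12=End.
awk 'NR==1{print $2,$4,$6,$8,$10}{print $3,$5,$7,$9,$11}' RS= blocks
# A B C D E
# 1 2 3 4 5
# 11 12 23 25 21
```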
output:
A B C D E
1 2 3 4 5
11 12 23 25 21
Following radoulov's code:
awk '/Start/ {i++;a=1;next} /End/{a=0;next} {if (a==1) print > "file" i}' infile
join file* | awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}'
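To spell out how the two steps fit together, here is a sketch assuming exactly two Start/End blocks, so join gets exactly two files:

```shell
cat > infile2 <<'EOF'
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
EOF

rm -f file1 file2
# Step 1: write each Start...End block to its own file (file1, file2).
awk '/Start/ {i++;a=1;next} /End/{a=0;next} {if (a==1) print > "file" i}' infile2
# Step 2: join merges the blocks on the key column ("A 1 11", "B 2 12", ...),
# then the transpose awk from earlier in the thread flips rows and columns.
join file1 file2 |
awk 'END {
  for (i = 0; ++i <= NF;)
    for (j = 0; ++j <= NR;)
      printf "%s", _[j, i] (j < NR ? FS : RS)
}
{
  for (i = 0; ++i <= NF;)
    _[NR, i] = $i
}'
# A B C D E
# 1 2 3 4 5
# 11 12 23 25 21
```

Keep in mind that join takes exactly two files, so with more than two blocks you would have to join the result with the next file repeatedly.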
Please tell me the procedure to execute this code.
Or you could use Perl -
$ cat f3
Start
A 1
B 2
C 3
D 4
E 5
End
Start
A 11
B 12
C 23
D 25
E 21
End
$ perl -ane 'if (/^Start/) {$in=1}
elsif (/^End/) {push @m, []; $in=0; $h=1}
elsif ($in) {
if (!$h) {push @{$m[0]},$F[0]; push @{$m[1]},$F[1]}
else {push @{$m[$#m]}, $F[1]}
}
END {foreach $i (@m) {print join "\t",@$i,"\n"}}
' f3
A B C D E
1 2 3 4 5
11 12 23 25 21
tyler_durden
Yes,
just define the FS:
awk -F'\t' ...
@Radoulov
Is it possible to apply the code to transpose 30,000 rows and 1,000 columns?
When I ran your script, nothing happened except my system freezing.
What system are you using? Which awk implementation and version?
Could you try the code with a small sample, just to check if it's
really because of the size?
It works fine with a small file, but not with the big file I specified in my recent post. I'm using a MacBook Pro with 4 GB RAM and an i5 processor. A new version of awk, I guess.
Could you please post the output of the following command:
awk --version | head -1
awk version 20070501
That seems to be the AT&T awk. Could you try installing GNU awk?
This version doesn't need arrays or much memory, but uses more CPU and disk I/O.
The 1st column becomes the file name and the 2nd column becomes a line in that file.
#!/bin/ksh
delim=";"
rm -f *.key 2>/dev/null
cat file | while read key value xxx
do
    [ "$value" = "" ] && continue  # fewer than 2 values on this line
    echo "$value" >> "$key.key"
done
maxsize=0  # how many lines we need = number of lines in the longest column file
for col in *.key
do
    size=$(wc -l < "$col")
    ((size>maxsize)) && maxsize=$size
    colname=${col%%.*}  # strip everything from the first dot = name of the col
    print -n "$colname$delim"
done
echo
line=1
while ((line<=maxsize))
do
    for col in *.key
    do
        value=$(sed -n "${line}p" "$col" 2>/dev/null)  # take the Nth line from the file
        print -n "$value$delim"
    done
    echo
    ((line+=1))
done
@rad:
I installed gawk 3.1.8 and there is still no output, and my Mac is screaming.
@kshji
How do I use your code?
My file looks like:
name a1 a2 .....a200
n1 0 2.4..........0.339
.
.
.
n200000 139.9 333339.9........0.989
This code keeps rereading the file (so it is slower), but it does not use a lot of memory:
awk 'NR==1{for(i=2;i<=NF;i++)ARGV[ARGC++]=ARGV[1]}
FNR==1{print x;c++}{printf "%s"FS,$c}END{print x}' infile
So perhaps you could give that a try. I have not tested it on large matrices. There should only be one input file, with no empty lines in it.
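For what it's worth, here is how the trick works: on the first record, ARGV[ARGC++]=ARGV[1] queues the input file NF-1 extra times, so awk re-reads it once per column; each new pass (FNR==1) ends the previous output row and advances $c to the next column. As written above, the first print x also emits a leading blank line, so this sketch uses a small variant of my own ("if (c++)") that prints the row break only from the second pass on:

```shell
cat > infile <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# Re-read the file once per column: O(1) memory, NF passes of I/O.
# Variant of the code above: "if (c++)" suppresses the leading blank line.
awk 'NR==1{for(i=2;i<=NF;i++)ARGV[ARGC++]=ARGV[1]}
FNR==1{if(c++)print x}{printf "%s"FS,$c}END{print x}' infile
# A B C D E
# 1 2 3 4 5
# (each output row ends with a trailing field separator)
```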
My solution was for aravindj80's original data, simplified from the original.
Next is a solution that also works if you have more columns per line. It just splits the columns into key files, one value per line. The solution also includes the data which I used to test.
cat <<EOF > $0.tmp
Start
A 1 K1 k1
B 2
C 3 T2 t4
D 4 C5 c4 cc6 hg55
E 5
End
Start
A 11
B 12
C 23 C5a c4a cc6a hg55a
D 25
E 21
End
Start
A 71
B 12
D 75
E 21
End
EOF
rm -f *.key 2>/dev/null
delim=";"
# The 1st column becomes the name of the file (= the name of the col); the other values become lines in that file.
cat $0.tmp | while read key values
do
    [ "$values" = "" ] && continue
    for value in $values
    do
        echo "$value" >> "$key.key"
    done
done
# now the data is in the files; print it out
maxsize=0  # height of the output table = number of lines in the longest file
for col in *.key
do
    size=$(wc -l < "$col")
    ((size>maxsize)) && maxsize=$size
    colname=${col%%.*}
    print -n "$colname$delim"
done
echo
# now we have the height of the output table
line=1
while ((line<=maxsize))
do
    for col in *.key
    do
        value=$(sed -n "${line}p" "$col")  # take the Nth line from the file
        print -n "$value$delim"
    done
    echo
    ((line+=1))
done
$ <abc tr ' ' '\n' | pr -5 -s' ' -t
$ cat abc
A 1
B 2
C 3
D 4
E 5
$ <abc tr ' ' '\n' | pr -5 -s' ' -t
A B C D E
1 2 3 4 5
$
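To unpack that last one: tr splits the two-column file into one token per line, and pr reassembles the tokens down five columns (-5), so A and 1 land in the first column, B and 2 in the second, and so on; -t drops the headers and trailers, and -s' ' separates the columns with a single space. The column count (-5) has to match the number of rows in the original file:

```shell
cat > abc <<'EOF'
A 1
B 2
C 3
D 4
E 5
EOF

# One token per line, then re-columnize down 5 columns.
tr ' ' '\n' < abc | pr -5 -s' ' -t
# A B C D E
# 1 2 3 4 5
```

Very compact, but unlike the awk solutions it needs the row count up front.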