nans
April 29, 2015, 8:26am
1
Hello,
I have a text file with 148 rows and 2532691 columns. I need to transpose the data. The command that I am using is
awk '
{
for (i=1; i<=NF; i++) {
a[NR,i] = $i
}
}
NF>p { p = NF }
END {
for(j=1; j<=p; j++) {
str=a[1,j]
for(i=2; i<=NR; i++){
str=str" "a[i,j];
}
print str
}
}' file.raw > output
does not work for me due to memory issues: the job crashes on the server. Is there a better way to handle the data?
many thanks
sea
April 29, 2015, 9:12am
2
I don't know awk well enough, but I wouldn't think that it causes a memory issue.
That said, it seems to me that these parts might be the cause:
str=a[1,j]
str=str" "a[i,j];
Other than that, I don't see the array a being initialized at all.
hth
RudiC
April 29, 2015, 9:15am
3
Using intermediate files will relieve memory. Try
awk '{$1=$1; print > "FILE"NR}' FS="," OFS="\n" file.raw
paste FILE* > output
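To see what these two commands do, here is a minimal run on a hypothetical 2-row, 3-column comma-separated sample (the file and FILE* names match the commands above; the sample data itself is made up). Each input row becomes its own one-field-per-line file, and paste then glues those files together side by side, which yields the transpose without ever holding the whole matrix in memory:

```shell
# Hypothetical 2x3 comma-separated sample
printf 'a,b,c\nd,e,f\n' > file.raw

# Step 1: rewrite each input row as a column file, one field per line.
# $1=$1 forces awk to rebuild the record using OFS="\n".
awk '{$1=$1; print > "FILE"NR}' FS="," OFS="\n" file.raw

# Step 2: paste the column files side by side (tab-separated by default)
paste FILE* > output

cat output
# a	d
# b	e
# c	f
```

One caveat for 148 rows: the shell expands FILE* in lexical order, so FILE10 sorts before FILE2; with more than nine rows you may want zero-padded names (e.g. sprintf("FILE%03d", NR)) to keep the columns in order.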
nans
April 29, 2015, 11:01am
4
Thank you, but I am still running into memory issues. This is my script:
awk '{$1=$1; print > "FILE"NR}' FS="," OFS="\n" file.raw
awk '
{
for (i=1; i<=NF; i++) {
a[NR,i] = $i
}
}
NF>p { p = NF }
END {
for(j=1; j<=p; j++) {
str=a[1,j]
for(i=2; i<=NR; i++){
str=str" "a[i,j];
}
print str
}
}' FILE* | paste FILE* > output
RudiC
April 29, 2015, 12:23pm
5
I'm at a loss as to why you tore my two commands apart and then complain that it doesn't work.
To prove it works, try smaller but representative sample files.
nans
April 30, 2015, 7:58am
6
well, because
awk '{$1=$1; print > "FILE"NR}' FS="," OFS="\n" file.raw
only splits the file into rows and does not transpose it into columns. So my intention was to split the rows, transpose them, and then join all the columns.
RudiC
April 30, 2015, 8:11am
7
What's your field separator? If it's a space, drop the FS=",". If it's something else, use that for the FS variable.
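For a whitespace-separated file (awk's default field separator), the same two commands would look like this; only OFS needs to be set, since the default FS already splits on blanks (sample data below is hypothetical):

```shell
# Hypothetical 2x3 space-separated sample
printf '1 2 3\n4 5 6\n' > file.raw

# No FS="," here: awk's default FS handles whitespace
awk '{$1=$1; print > "FILE"NR}' OFS="\n" file.raw
paste FILE* > output

cat output
# 1	4
# 2	5
# 3	6
```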