Hi all,
I'm writing a simple bash script to do an iterative paste over more than 10,000 files with two columns each.
The files look like this:
A1BG 3
A1CF 3
A2M 3
A4GALT 5
AAAS 2
AACS 2
AADAT 2
AAGAB 4
AAK1 3
AAMP 2
AANAT 3
AARS 2
AARS2 3
AARSD1 2
...
And the code is this:
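SPHERE=Sphere.matrix.txt
rm -f "$SPHERE"
# the glob already expands in sorted order, so no need to parse `ls | sort`
for i in *.spheres; do
    if [ -f "$SPHERE" ]; then
        # append column 2 of this file to the matrix built so far
        cut -f2 "$i" | paste "$SPHERE" - > "$SPHERE.tmp"
        mv "$SPHERE.tmp" "$SPHERE"
    else
        # first file: keep both columns
        cat "$i" > "$SPHERE"
    fi
done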
It opens the first file and writes it to the output file. Then it opens the second file, takes its second column, and pastes it onto the output file, and so on.
The code works properly, but it gets progressively slower because each iteration has to read and rewrite an ever larger file.
If I could store the result of each iteration in memory instead of on disk, I think the performance would be much better.
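A minimal sketch of the in-memory idea, assuming the files are tab-separated (as `cut -f2` implies) and that every file lists the same genes in the same order. A single awk pass keeps one string per row in memory and appends each file's second column to it, so nothing is rewritten on disk until the end. `build_matrix` is a hypothetical helper name, and the whole matrix has to fit in RAM:

```shell
#!/usr/bin/env bash
# Sketch only: build the whole matrix in memory with one awk pass.
# Assumptions: tab-separated input, same rows in the same order in every file.
# build_matrix is a hypothetical helper name, not part of the script above.
build_matrix() {
    awk -F'\t' -v OFS='\t' '
        FNR == 1   { nfile++ }                            # a new input file starts
        nfile == 1 { row[FNR] = $0; nrows = FNR; next }   # first file: keep both columns
                   { row[FNR] = row[FNR] OFS $2 }         # later files: append column 2
        END        { for (i = 1; i <= nrows; i++) print row[i] }
    ' "$@"
}

# Usage (the glob expands in sorted order):
# build_matrix *.spheres > Sphere.matrix.txt
```

Since `ls *.spheres` already works with this many files, the expanded file list should also fit on the awk command line. awk reads the files one after another, so it should not run into the open-file limit that a single `paste` over all files could hit.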
Could you please give me some guidance on this?
Thank you so much!