Store 'for' loop iterations in memory

Hi all,

I'm writing a simple bash script that repeatedly pastes together more than 10,000 files, each with two columns.

Each file looks like this:

A1BG    3
A1CF    3
A2M    3
A4GALT    5
AAAS    2
AACS    2
AADAT    2
AAGAB    4
AAK1    3
AAMP    2
AANAT    3
AARS    2
AARS2    3
AARSD1    2
...

And the code is this:

SPHERE=Sphere.matrix.txt
rm -f "$SPHERE"
for i in *.spheres; do    # the glob already expands in sorted order
    if [ -f "$SPHERE" ]; then
        cut -f2 "$i" \
        | paste "$SPHERE" - > "$SPHERE.tmp"
        mv "$SPHERE.tmp" "$SPHERE"
    else
        cat "$i" > "$SPHERE"
    fi
done

It opens the first file and writes it to the output file. Then it opens the second file, takes its second column, and pastes it onto the output file, and so on.

The code works properly, but it gets progressively slower, because on each cycle it has to open and rewrite an ever-larger file.
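One way to avoid the per-file rewriting entirely is to hand all the inputs to a single paste call, then drop the repeated gene-name columns. A sketch with made-up sample files (the real data would just be the existing *.spheres glob):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample data (hypothetical names and values) in a scratch directory.
dir=$(mktemp -d)
printf 'A1BG\t3\nA1CF\t3\n' > "$dir/a.spheres"
printf 'A1BG\t5\nA1CF\t2\n' > "$dir/b.spheres"
cd "$dir"

# paste accepts all files at once, so the matrix is written in one pass
# instead of being rewritten per file. Each input contributes two columns;
# awk then keeps column 1 once plus every value column (the even fields).
paste *.spheres \
| awk -F'\t' '{ out = $1; for (i = 2; i <= NF; i += 2) out = out "\t" $i; print out }' \
> Sphere.matrix.txt
```

One caveat: paste opens every input simultaneously, so with more than 10,000 files this may run into the per-process open-file limit (ulimit -n), which would need raising first.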

If I could store the result of each iteration in memory instead of on disk, I think the performance would be much better.
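That in-memory idea maps naturally onto awk: read every file once, accumulate all columns in arrays, and print the matrix a single time at the end. A sketch, assuming all files list the same genes in the same order (sample file names and values below are made up):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample data (hypothetical) in a scratch directory.
dir=$(mktemp -d)
printf 'A1BG\t3\nA1CF\t3\nA2M\t3\n' > "$dir/a.spheres"
printf 'A1BG\t5\nA1CF\t2\nA2M\t4\n' > "$dir/b.spheres"
printf 'A1BG\t1\nA1CF\t4\nA2M\t2\n' > "$dir/c.spheres"
cd "$dir"

# Single pass over all files: awk keeps every column in memory and
# writes the matrix once at the end, so no file is ever rewritten.
awk -F'\t' '
    FNR == 1 { col++ }          # first line of each file starts a new column
    {
        key[FNR] = $1           # gene name from column 1
        val[FNR, col] = $2      # this file'\''s value for that row
        if (FNR > rows) rows = FNR
    }
    END {
        for (r = 1; r <= rows; r++) {
            line = key[r]
            for (c = 1; c <= col; c++)
                line = line "\t" val[r, c]
            print line
        }
    }
' *.spheres > Sphere.matrix.txt
```

Because awk reads the files sequentially, this avoids the open-file limit that a single giant paste would hit; memory use is proportional to the total size of the data.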

Could you please give me some guidance on this?

Thank you so much!

Do those input files all share the same column-1 values? If so, have you considered the join command?
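Unlike blind column pasting, join merges two files on a shared key, so rows stay aligned by gene name even if a file has a missing or extra entry. A minimal sketch of the same loop with join (sample file names and values are made up; join requires the inputs sorted on the key, which these gene lists already are):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample data (hypothetical) in a scratch directory.
dir=$(mktemp -d)
printf 'A1BG\t3\nA1CF\t3\n' > "$dir/a.spheres"
printf 'A1BG\t5\nA1CF\t2\n' > "$dir/b.spheres"
cd "$dir"

# Merge each file into the growing matrix on column 1 (the default
# join field), so values are matched by gene name rather than position.
out=Sphere.matrix.txt
first=1
for f in *.spheres; do
    if [ "$first" -eq 1 ]; then
        cp "$f" "$out"
        first=0
    else
        join -t $'\t' "$out" "$f" > "$out.tmp" && mv "$out.tmp" "$out"
    fi
done
```

Note this still rewrites the matrix once per file, so it shares the quadratic slowdown of the paste loop; its advantage is correctness when the key columns don't line up exactly.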
