Dsh command - shell script - sys args?

Sorry, a newbie question...!

I want to use a linux cluster to copy a list of files. I want to split the processing over 3 nodes so that each node gets (more or less) an equal share.

My script (base.sh) to execute my copy script (copy.sh) looks something like:

#!/bin/bash

for NODE in 1 2 3
do
        /sw/egs/bin/dsh -w node0${NODE} -e "/home/ts/scripts/copy.sh"

done

My copy.sh file is:

#!/bin/bash

IDIR="/home/ts/test2/temp"
ODIR="/home/ts/test3/
FARRAY=( "$IDIR"/*.R )
COUNT=${#FARRAY[@]}

THIS_NODE=
TOTAL_NODES=3


for i in `seq 1 $COUNT`
do

        # Bash arrays are zero-based, so shift the one-based loop index down.
        THIS_FILE=${FARRAY[$((i-1))]}
        REM=`expr $i % $TOTAL_NODES`


        if [ $REM -eq 0 ]
        then
                REM=$TOTAL_NODES
        fi
    

        if [ $REM -eq $THIS_NODE ]
        then 
                cp "$THIS_FILE" "$ODIR"
        fi

done

My questions:

  1. How do I capture which node is running this job (THIS_NODE) in copy.sh?
  2. How can I modify the base.sh script so that the total number of nodes can also be passed into the copy.sh script? (sys args? - how?)
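
(I'm guessing this means positional parameters -- something like the sketch below, with copy.sh picking up its node number and the node count as $1 and $2? I'm not sure whether dsh passes a quoted command string through like that.)

#!/bin/bash
# base.sh -- hypothetical sketch: hand each node its number and the total

TOTAL_NODES=3

for NODE in 1 2 3
do
        /sw/egs/bin/dsh -w node0${NODE} -e "/home/ts/scripts/copy.sh ${NODE} ${TOTAL_NODES}"
done

and then at the top of copy.sh:

THIS_NODE=$1
TOTAL_NODES=$2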

Is there a better/shorter/sleeker way to do what I am doing? Any other suggestions?

thanks!

Let's stop and think about this. Three CPUs can run programs faster, because they can run independently of each other. But how exactly are three CPUs going to speed up your disk? It doesn't matter if your CPUs can send disk-read commands faster than light; the disk can only do so much, and one CPU is going to max that out. Three may start it thrashing -- slowing it down.

Additionally -- the fact that you're having trouble copying lots of files fast enough tells me there may be another problem here, like millions of files crammed into one folder, which multithreading cannot solve either.

There are a few very specific circumstances where multiple threads may speed this up -- a NAS with independent dedicated links, or some odd kinds of software RAID -- but I won't assume that's the case without being told.

Back up and tell me more about your system and the problem you are trying to solve, please.

Ha, quite right.

I just used copy as an example. Instead of copying, each file will be subjected to some processing, and some output from this will be written to disk.

Sorry, I should have mentioned...


My apologies. I was having flashbacks to a thread dealing with ten million files in one directory; the OP refused to believe his disk couldn't be multithreaded via some magic perl or python code... :wall: Now that that's out of the way!

Do all three machines share the same disk? If not, dsh won't be useful here!

1) Your loops are overcomplicated. Instead of

ARR=( whatever ) ; for i in `seq ...`

just loop over the glob directly:

for FILE in whatever/*
do
        ...
done
2) Shell globbing runs into the kernel's argument-length limit: hand an expansion like whatever/* to an external command with enough files behind it and it dies with 'Argument list too long'. Better to use a utility like ls or find to print the names to a pipe when you don't know how many files there are.
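
For instance (a sketch -- /some/dir just stands in for a real directory):

# With enough files, this execs /bin/ls with every name as an
# argument and can die with "Argument list too long":
ls /some/dir/*.R

# find prints the names to a pipe instead, so they never have to fit
# into a single argument list:
find /some/dir -maxdepth 1 -type f -name '*.R' | while IFS= read -r FILE
do
        printf '%s\n' "$FILE"
done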

3) Don't have your sub-programs check which files are "theirs" -- tell them which files are theirs. Feed the names into each program so they don't have to guess. This avoids the nodes getting out of sync (if a file gets added to the directory after one node has run but before another does, for example).
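
Something like the pair of scripts below, then -- untested, so treat it as a sketch. The first splits the file list round-robin into one list per node and ships each list off: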

#!/bin/bash

NODES=3
N=0

# If there are thousands of files, expanding '*.R' onto a command line
# can fail with 'Argument list too long'.  So we use find instead, which
# prints to a pipe, avoiding arguments entirely.
find /home/ts/test2/temp/ -mindepth 1 -maxdepth 1 -type f -name '*.R' | while IFS= read -r FILE
do
        echo "file $FILE goes to node $N"
        echo "$FILE" >> /tmp/$$-$N
        N=$(( (N+1) % NODES ))
done

for ((N=0; N<NODES; N++))
do
        # I am assuming dsh can read from standard input here.
        # If this is wrong, that won't work :(
        # The list files are numbered 0..NODES-1 but the nodes are
        # node01..node03, hence the +1.
        /sw/egs/bin/dsh -w node0$((N+1)) -e "/home/ts/scripts/copy.sh" < /tmp/$$-$N &
done

wait

# Only delete the per-node lists once every dsh has finished reading them.
rm -f /tmp/$$-*
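
And the modified copy.sh on each node just reads its share of the list from standard input:
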
#!/bin/bash

ODIR="/home/ts/test3/

while IFS= read -r FILE
do
        echo "Got file $FILE"
        echo cp "$FILE" "$ODIR"
done
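
Note the echo in front of cp -- it makes this a dry run that only prints what it would do. Once you've verified each node is getting the right files, take the echo off.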

Thanks for the detailed explanation, Corona. I've learnt a lot!