Load Balancing in UNIX

Dear All,
Can anyone help me with this request?

Here is the case: I have 20 files which I need to FTP to 5 servers. I want to know if it is possible to build a load balancer which transfers the files to the 5 servers in a round-robin manner.

The theoretical flow I have in mind is:
There are 20 files, aaa to ttt, and the servers are 1.1.1.1 through 5.5.5.5.
I will write a script which checks the number of files present in a folder, then starts picking the files and transferring them to the servers one by one:
file1 to server1 through file5 to server5, then
file6 to server1 again, and so on.
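
In shell terms, the mapping I have in mind is roughly this (the file names are placeholders):

n=1
for file in aaa bbb ccc   # ... through ttt
do
 server=$(( (n - 1) % 5 + 1 ))   # round robin: 1 2 3 4 5 1 2 ...
 echo "$file -> server$server"
 n=$(( n + 1 ))
done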

I am using Solaris. Can anyone suggest how to achieve this?

GNU Parallel

Download GNU Parallel; it does exactly what you want. We had it on Solaris 10 and it worked fine. Note that there is a Solaris version on the website.
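
For example, a minimal sketch (the IPs and the echo are placeholders for your servers and the real transfer command; on older versions the --link option is spelled --xapply):

parallel --link -j5 'echo "transfer {1} to {2}"' ::: * ::: 1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5

--link recycles the shorter server list against the file list, which gives exactly the round-robin pairing, and -j5 keeps 5 transfers running at a time.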

Thanks. I will keep it aside to check. However, we can't install any utility or patch on the live servers for this.
So, I am trying to get ideas on how to develop a script instead.


Any help on this please?

Might I suggest an approach like this, then:

  • List your filenames into a work file
  • Split the list into 5 smaller job input files, either with split for the appropriate number of rows or, if you can sort the files by size, by distributing them round-robin into the 5 job input files
  • Initiate a job for each job input file in the background
  • Wait. Not a joke: use the wait command to pause your script until all the transfers complete
  • Check the logs from your background jobs, should you wish to do so
  • End. (See the sketch after this list.)
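
A minimal sketch of those steps, assuming the files sit in a directory of their own and using illustrative names throughout:

#!/bin/bash
srcdir=/path/to/outgoing              # placeholder for the directory holding the 20 files
ls "$srcdir" > worklist.txt           # 1. list the filenames into a work file
split -l 4 worklist.txt joblist.      # 2. 20 names / 5 jobs = 4 rows per job input file
i=0
for list in joblist.*                 # 3. one background job per job input file
do
 i=$(( i + 1 ))
 (
  while read -r file
  do
   echo "transfer $file to server$i"  # replace with the real ftp command
  done < "$list"
 ) > "job$i.log" 2>&1 &
done
wait                                  # 4. pause until all transfers complete
cat job*.log                          # 5. check the logs from the background jobs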

Does that structure help? If it does, have a go and let us know how far you get. Please share a working solution if you can so others finding this thread may learn from it.

If you get stuck, show us what you have so far and we will try to suggest options.

Kind regards,
Robin

Thanks rbatte1.
I got your idea.
I have placed the files in one directory, listed them, and captured the output in a file.
The splitting is done and I now have 3 files.

However, I am unable to find a suitable way to create the separate jobs.
One possibility is to create a sub-script for each job input file and then start each in the background using the ampersand sign.
Is that a good approach?
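
What I mean is roughly this (job1.sh, job2.sh and job3.sh are just illustrative names for the sub-scripts):

./job1.sh > job1.log 2>&1 &   # start each sub-script in the background
./job2.sh > job2.log 2>&1 &
./job3.sh > job3.log 2>&1 &
wait                          # block until all three have finished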

Do you want to transfer each file once, i.e. to one server, or many times, i.e. to each server?

My idea is to transfer multiple files to multiple servers, let's say 20 files to 5 servers.
I want to start the transfers to the multiple hosts in parallel, and once the first 5 files have been uploaded to the 5 servers, move on to the next chunk, and so on. Hope you get my point.

Create a Korn shell script to do the job. Set up the ftp jobs as separate functions inside the script and use the coprocess facility to steer them. See the man page for ksh (either ksh88 or ksh93; both can do it) for details about co-processes.

(Note that other shells lack this special feature, so you will need a real Korn shell to use it: neither bash, pdksh, nor similar lookalikes will do.)
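
A minimal sketch of the facility itself (the worker function and file names are illustrative; the print inside the loop stands in for the real ftp job):

#!/bin/ksh
worker()
{
 while read file
 do
  print "done $file"          # replace with the real transfer and its status
 done
}

worker |&                     # start the function as a coprocess
for file in aaa bbb ccc
do
 print -p "$file"             # send a filename to the coprocess
 read -p reply                # read its status reply back
 print "worker reported: $reply"
done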

I hope this helps.

bakunin

The following simulates a transfer of the files in the current directory to 5 servers.

#!/bin/bash
servers=(
 1.1.1.1 2.2.2.2
 3.3.3.3 4.4.4.4
 5.5.5.5
)
# /bin/ksh syntax:
#set -A servers 1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5

sindex=0
smax=${#servers[@]}

for file in *
do
 server=${servers[$sindex]}
 # execute the ( subshell ) in the background
 (
 echo "transfer $file to $server"
 sleep 10
 ) &
 sindex=$(( sindex + 1 ))
 # after one file has gone to each server, wait for the whole batch
 if [ "$sindex" -eq "$smax" ]
 then
  sindex=0
  wait
 fi
done

This method transfers each file once, to one of the servers in round-robin order.
Replace the echo and sleep commands with a suitable transfer command.
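
For example, a non-interactive ftp here-document could serve as the transfer command (a hedged sketch: user, password and the remote directory are placeholders):

ftp -n "$server" <<EOF
user myuser mypassword
binary
cd /remote/dir
put $file
bye
EOF

The -n option suppresses auto-login so the credentials can be supplied with the user command inside the here-document.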


Here is a refined version that ensures that the maximum number of jobs is always running.
Instead of wait-ing for all background jobs to finish, it polls the number of running jobs in a loop.

#!/bin/bash
servers=(
 1.1.1.1 2.2.2.2
 3.3.3.3 4.4.4.4
 5.5.5.5
)
# /bin/ksh syntax:
#set -A servers 1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5

sindex=0
smax=${#servers[@]}
maxjobs=$smax

for file in *
do
 # ensure it is a regular file
 [ -f "$file" ] || continue
 server=${servers[$sindex]}
 # execute the ( subshell ) in the background
 (
 echo "transfer $file to $server"
 sleep $(( RANDOM % 10 ))
 ) &
 sindex=$(( (sindex + 1) % smax ))
 # poll the number of jobs until it drops below maxjobs
 until [ "$(jobs | wc -l)" -lt "$maxjobs" ]
 do
  sleep 2
 done
done

Hi,
It is very helpful; indeed, I learnt a few new things as well.
Thanks for the help. I am now finalizing it for my requirements.

Once done, I will post it here so others can benefit as well.