Parallel Execution on Multiple Systems

Hi All,

I am working on a project where I need to execute a set of arguments (around 500 sets) on a simulator. If I execute this on one Linux (RedHat 8.0) machine, it takes approximately 2-3 days. Since I have more Linux machines, I am thinking of executing these on different machines in parallel (i.e. parallel execution).

As a sanity check, I tried the following option, which was already posted in this forum;
i.e. Script1.sh & Script2.sh & Script3.sh. But I didn't find much savings with this option.

I also tried downloading the tool jobqueue 0.04 for Linux from Softpedia ("a program for executing jobs in parallel to complete all jobs as fast as possible").
It is able to split execution across multiple systems, but it won't take the complete argument list (e.g. if the list has 10 argument sets, it takes 8 or 9 and skips 1 or 2).

Could you please help me solve this problem?

Thanks in advance

Regards,
123an

not enough info for me...

are you talking about the same script with 500 different sets of arguments?
do you have the arguments already somewhere or
will you generate them?
should the jobs run in series once on their individual machine?
or can they run concurrently with a maximum threshold?

is this just for benchmarking? or is this going to be a permanent run and
everything should take about the same time?

i'm thinking . . . . just create all the command lines....
dump them all into a file...

then have another script, read this file,
divide them equally into scripts for each machine....
rcp these scripts to the respective machines,
and kick 'em off.
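
something like this rough sketch, assuming a machines.txt with one hostname per line and a cmds.txt where every line is a complete command (both file names are placeholders, not anything you already have):

#!/bin/sh
# divide cmds.txt evenly into one job script per machine,
# copy each script over, and kick it off in the background.
total=$(wc -l < cmds.txt)
nmach=$(wc -l < machines.txt)
per=$(( (total + nmach - 1) / nmach ))   # commands per machine, rounded up
split -l "$per" cmds.txt job.            # creates job.aa, job.ab, ...
i=1
for job in job.*; do
    mach=$(sed -n "${i}p" machines.txt)
    rcp "$job" "$mach:$job"
    rsh "$mach" "sh $job" &              # each machine works through its script in series
    i=$(( i + 1 ))
done
wait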

Hi quirkasaurus,

Thanks for your reply.. :)

As per your reply, I would like to give some more details about my problem.
I have 500 different sets of arguments in a file (say list.txt). These arguments need to be passed to an executable (or an application; say "./applictn.out") which will run and do the job for each argument set. That means I will get 500 sets of outputs from executing the whole thing.
If I execute this with a script serially (one after the other), execution will take more time, so within a system I can execute this in parallel (i.e. making them background processes) using xargs. Like
Syntax: xargs <utility> <arguments>
Given: xargs ./applictn.out list.txt

This will be executed in parallel on a single system..

Now my problem is that I want to execute these sets (the 500 argument sets) on different Linux systems, running in parallel, so that I get the output in a lower amount of time.
By parallelism I mean: machine 1 should take, say, 150 sets and start processing..
machine 2 should take, say, 200 sets and start processing..
machine 3 should take the remaining sets and should start working on them..

All machines should work in parallel...

Thanks..
123an

Sounds to me like you need to implement and manage a Linux cluster.

Have you looked at any of the Linux clustering software in our directory?

High Performance Computing - Links

With GNU xargs, you can use the -P option to run jobs in parallel. So

cat input.txt | xargs -L 1 -P 4 myprogram

would run your program with each line of input provided as the arguments to myprogram, and 4 jobs would be started at a time.
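
Applied to the files from your earlier post, that would be something like:

cat list.txt | xargs -L 1 -P 4 ./applictn.out

(-L 1 hands each whole line to one invocation, so a line holding several arguments stays together as one set; -n 1 would split on every word.)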

To distribute among several Linux nodes, you can build a cluster, as Neo suggested.

OR you can use ssh/rsh to distribute and run the jobs on the other hosts. First, make a file containing the hostnames of your cluster. If a host has 2 CPUs, list its hostname twice. If it has 4 cores, put 4 entries of that hostname.

From here you can go in different directions, but ultimately you run one rsh/ssh process per line in this hosts file.
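
For example, a rough sketch along those lines (assuming passwordless ssh, a hosts.txt as described above, and that ./applictn.out already sits in the home directory of every host):

#!/bin/sh
# start one ssh worker per line in hosts.txt, each fed its own slice
# of list.txt; a host listed twice gets two workers (2 CPUs).
total=$(wc -l < list.txt)
nhosts=$(wc -l < hosts.txt)
per=$(( (total + nhosts - 1) / nhosts ))   # argument sets per worker, rounded up
split -l "$per" list.txt slice.            # creates slice.aa, slice.ab, ...
i=1
for slice in slice.*; do
    host=$(sed -n "${i}p" hosts.txt)
    scp "$slice" "$host:$slice"
    ssh "$host" "while read a; do ./applictn.out \$a; done < $slice" &
    i=$(( i + 1 ))
done
wait    # returns once every worker has finished its slice

Each worker runs its slice serially, so a host that appears N times in hosts.txt ends up running N jobs concurrently.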

Hi Neo and otheus,

Thanks for your valuable replies....

I am currently exploring all options suggested here....

Thanks...

123an

Hi,

Thanks for your replies...

I am trying to simulate the environment that has been discussed here using ssh.
The problem I am facing while using ssh is explained below..

case1: ssh -f <machine_ip> <executable1> <arguments>
Here, the arguments are listed in a file and are read during execution. In this scenario, the parallel execution works fine.

case2: ssh -f <machine_ip> <cp file1 file2> <executable2>
Here, the executable in case 2 won't take command line arguments as in case 1; i.e. the executable always reads its input arguments from a file named "args".
So what I tried here is: I created multiple text files with different arguments; during ssh I tried to copy each of these text files to the file "args" and then execute executable 2. But I am receiving an error saying "Cannot fork into background without a command to execute.".

Can anybody suggest a solution for this?

Thanks,
123an

Well, case2 appears to be (1) a copy and (2) a command to run. You can do this as if the entire set of commands were passed to "sh -c" as a single argument. That is, put the set of commands, separated by semi-colons (;), in double (or single) quotes:

ssh root@fwlb2 "echo 1; echo 2"

The -f option makes no difference here.
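
Applied to your case2 (the file and executable names are placeholders for whatever you actually use):

ssh -f <machine_ip> "cp file1 args; ./executable2"

The quoted string is passed to the remote shell as a single command, so -f now has a command to execute and the "Cannot fork into background" error goes away.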