Help with Running More than One Program

Folks,

I'm really new to scripting and was wondering if you could help me out. I have the following script that I inherited:

#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-any-agent AgentName
#

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config config/$1.conf

So, when I type in

./run-any-agent MyProgram

it runs "MyProgram". Since I have many programs to run with this script at the same time, I typically open several terminals and run the script on each program individually, which is becoming painful. Is there any way that I could type something like:

./run-all-agents MyProgram1 MyProgram2 ... MyProgramN

And have them all run at the same time. A nudge in the right direction would be very much appreciated.

Thank you,

DTW

This should do the trick. '$*' expands to the positional parameters, that is, the arguments you pass when calling "run-all-agents". The loop starts "run-any-agent" once per argument, each in the background.

#!/bin/bash

for agent in $*
do
    ./run-any-agent $agent &   # run in background
done
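
A note on "positional parameters": they are simply the words typed after the script name ($1 is the first, $2 the second, and so on). Unquoted $* and quoted "$@" both expand to all of them, but they differ once an argument contains a space. A minimal sketch, assuming bash; the function names are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

demo_positional() {
  # Pretend the script was called with two arguments, one containing a space.
  set -- "MyProgram1" "My Program2"
  echo "unquoted: $(count_args $*)"     # word-splitting breaks the second name
  echo "quoted: $(count_args "$@")"     # each argument stays intact
}

demo_positional
```

With agent names that contain no spaces, as in this thread, both forms behave the same; "$@" is just the safer habit.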

2pugs,

Thanks for your reply.

So, if I understood this right, you're suggesting that I write another script called "run-all-agents" to execute the "run-any-agent" script - right?

What do you mean by "positional parameters"? So, will I be calling the script like this?

./run-all-agents MyProgram1 MyProgram2 ... MyProgramN

I'll mess with this a little and report back. It might also be nice to read the programs I want to run off a file (rather than "manually" typing them in). Anyway, I'll be back.

Thanks,

DTW
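
On reading the program names from a file: one possible approach, sketched under the assumption that the file holds one agent name per line. The function name and the file name agents.txt are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one agent name per line from a list file,
# skipping blank lines and lines that start with '#'.
read_agents() {
  local file=$1 name
  # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
  while IFS= read -r name || [ -n "$name" ]; do
    case $name in
      ''|'#'*) continue ;;
    esac
    printf '%s\n' "$name"
  done < "$file"
}

# Usage sketch (agents.txt is an assumed file name):
#   ./run-all-agents $(read_agents agents.txt)
```

The unquoted command substitution in the usage line relies on the names containing no spaces.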

---------- Post updated at 04:01 PM ---------- Previous update was at 03:50 PM ----------

The script seems to run just fine. I tried running two programs for now using the command:

./run-all-agents MyProgram1 MyProgram2

Is there a clean way to stop the programs, though? I used to be able to hit Ctrl+C to stop them, but that no longer works: they keep running after I press Ctrl+C. Can a script be written to halt the running programs cleanly too?

Thanks,

DTW

Sure, but it's a little tricky. You would first need to find out the PID of the agent you started. You could do this with the 'ps' command.

% ps -ef | grep "run-any-agent MyProgramN " | grep -v grep

  userid 13112 13110  4 15:10:03  ttyp12    00:00:00 run-any-agent MyProgramN

The PID in the above example is 13112. The tricky part is you don't want to leave any child/zombie processes that may have spawned from your script. If it does spawn extra processes then your script will have to find all the children and kill them in order. The last process you would kill would be 13112 since that is your original process. It's definitely doable though. Hope that makes sense.

** NOTE: Notice that I left a space after MyProgramN in my grep statement. This was done to ensure you had the right process for those cases when you might have MyProgram1 and MyProgram11 running. If you're looking for MyProgram1, then you want to make sure you don't accidentally catch MyProgram1N.

Thanks for your reply. Cool. So far I've been doing:

ps aux | more

And then I scroll through the list to find the PIDs. Sometimes when I use the "kill -9" command, though, it doesn't seem to kill the process. I'm not sure why.

% ps -ef | grep "run-any-agent MyProgramN " | grep -v grep

  userid 13112 13110  4 15:10:03  ttyp12    00:00:00 run-any-agent MyProgramN

I'm not sure what "-ef" and "-v" do but I can look them up.

Yes, it makes a lot of sense. My next question then would be: How do I find out all the children or zombie processes and then go ahead killing them nicely? This way, then I'd have a neat way (thanks to you) to start the programs and then another neat way to kill them all. It would save me a lot hassle. I must say that scripts are cool. :slight_smile:

Thanks,

DTW

P.S:

Hmm - I'd have to think about what you wrote, but, thank you for clearing that up.

---------- Post updated at 04:55 PM ---------- Previous update was at 04:34 PM ----------

So, I tried:

ps -ef | grep "run-any-agent MyProgram1 MyProgram2 MyProgram3 " | grep -v grep

After I typed:

./run-all-agents MyProgram1 MyProgram2 MyProgram3

But nothing happened...Did I miss something?

DTW

---------- Post updated at 05:15 PM ---------- Previous update was at 04:55 PM ----------

So, "-e" seems to be a way to use a pattern of some sort. I'm not sure what the "f" after the "e" does, really.
"-v" specifies the string that we don't want to grep.

grep abc -v

would mean look for all files that DON'T have "abc" in them? Does this sound correct?

DTW
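
On the options asked about above: for ps, -e selects every process and -f requests the full-format listing; for grep, -v inverts the match, so only lines that do NOT contain the pattern are printed (grep filters lines of its input here, not files). Each agent also runs as its own process, so a pattern listing all three names on one line matches nothing. A small sketch using canned text in place of real ps output; the PPID column and the grep line are made up for illustration:

```shell
#!/bin/bash
# Canned lines standing in for `ps -ef` output (illustrative values).
sample_ps() {
  cat <<'EOF'
DTW 31685 31600 0 16:48 pts/7 00:00:00 /bin/bash ./run-any-agent MyProgram1
DTW 31686 31600 0 16:48 pts/7 00:00:00 /bin/bash ./run-any-agent MyProgram2
DTW 31690 31600 0 16:48 pts/7 00:00:00 grep run-any-agent MyProgram1
EOF
}

# Keep lines that end with the given agent name, then use grep -v to drop
# the line belonging to the grep command itself.
find_agent() {
  sample_ps | grep "run-any-agent $1\$" | grep -v grep
}

find_agent MyProgram1
```

Calling find_agent "MyProgram1 MyProgram2" prints nothing, because no single line contains both names.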

You could try something like this:

#!/bin/bash
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}
for agent in "$@"
do
  ./run-any-agent $agent &   
done
wait

Scrutinizer,

Thank you for your post. I'll check out what you wrote and report back. I also tried:

ps aux | grep run-any-agent 

And it returned:

DTW    31685  0.0  0.0   2988  1368 pts/7    S    16:48   0:00 /bin/bash ./run-any-agent MyProgram1
DTW    31686  0.0  0.0   2988  1372 pts/7    S    16:48   0:00 /bin/bash ./run-any-agent MyProgram2
DTW    31687  0.0  0.0   2988  1364 pts/7    S    16:48   0:00 /bin/bash ./run-any-agent MyProgram3

Perhaps that could be used somehow too?

Thanks,

DTW

---------- Post updated at 05:40 PM ---------- Previous update was at 05:27 PM ----------

I tried running this...It didn't apparently do anything that I could see. Hmm - interesting.

DTW

Did you run it with parameters?
Did you press Ctrl-C afterwards?

---------- Post updated at 23:51 ---------- Previous update was at 23:44 ----------

Sure, you can kill all processes with that name:

kill $( ps aux|awk '/[r]un-any-agent/{print $2}' )

-or-

kill $(pgrep run-any-agent)

-or if you feel lucky today-

pkill run-any-agent

The last two commands are not available on every system.
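
A note on the [r] in the awk pattern: the regular expression [r]un-any-agent matches the text "run-any-agent", but the awk command's own entry in the process list contains the literal characters "[r]un-any-agent", which the pattern does not match. That removes the need for a separate "grep -v". A sketch on canned input; the function name and the sample lines are assumptions for illustration:

```shell
#!/bin/bash
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# (second column) of every line containing the text "run-any-agent".
agent_pids() {
  awk '/[r]un-any-agent/ { print $2 }'
}

# Canned input; the last line imitates the awk command's own ps entry,
# which the pattern skips.
agent_pids <<'EOF'
DTW 31685 0.0 0.0 2988 1368 pts/7 S 16:48 0:00 /bin/bash ./run-any-agent MyProgram1
DTW 31686 0.0 0.0 2988 1372 pts/7 S 16:48 0:00 /bin/bash ./run-any-agent MyProgram2
DTW 31700 0.0 0.0 2988 1100 pts/7 S 16:49 0:00 awk /[r]un-any-agent/{print $2}
EOF
```

Note that all three commands target the run-any-agent wrapper processes, not any java processes they started.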

No and no. Sorry! :slight_smile: Anyway, I redid the test and I got something like this:

Then I hit Ctrl+C. However, after a few seconds, the programs started back up. That was annoying! I'll look at your other suggestions and report back. (Though it probably won't be today since I need to be heading out; I'll get back tomorrow for sure.)

Thanks much,

DTW

---------- Post updated 01-21-10 at 10:32 AM ---------- Previous update was 01-20-10 at 06:07 PM ----------

I tried it again this morning. I typed:

 sh ./run-all-agents MyProgram1 MyProgram2

The programs started up nicely. Then, to kill them, I typed:

sh ./kill-all-agents MyProgram1 MyProgram2

And then pressed Ctrl+C. This is what I saw:

./kill-all-agents: line 4: 13954 Terminated              ./run-any-agent $agent
./kill-all-agents: line 4: 13955 Terminated              ./run-any-agent $agent

I was happy but that was short-lived; after a few seconds, the programs started back up. What gives? :confused: I tried typing the same thing a few times, with the same result: the programs continue to run. What else can I try?

Thanks,

DTW

---------- Post updated at 10:41 AM ---------- Previous update was at 10:32 AM ----------

Just to be perfectly clear:

[1] The run-any-agent script has:

#!/bin/bash
#
# Usage
# From the agent directory:
#   sh ./run-agent
#

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config config/$1.conf 

[2] The run-all-agents script has:

#!/bin/bash

for agent in $*
do
    ./run-any-agent $agent &   # run in background
done

[3] The kill-all-agents has:

#!/bin/bash
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}
for agent in "$@"
do
  ./run-any-agent $agent &   # run in background
done
wait

Thanks,

DTW

---------- Post updated at 10:53 AM ---------- Previous update was at 10:41 AM ----------

I also tried all of the commands:
[1]

kill $( ps aux|awk '/[r]un-any-agent/{print $2}' )

[2]

kill $(pgrep run-any-agent)

[3]

pkill run-any-agent

But nothing seems to work. The programs keep going.

Thanks,

DTW

---------- Post updated at 11:34 AM ---------- Previous update was at 10:53 AM ----------

So, I eventually ended up restarting my machine...It looked like there were too many zombie processes for me to get rid of. I tried running the "run-all-agents" script again followed by the "kill-all-agents" but it seems impossible to stop the programs once they kick off.

DTW

---------- Post updated at 12:06 PM ---------- Previous update was at 11:34 AM ----------

I'm running out of ideas...I executed the following command:

ps aux | more

And then scanned my screen for any processes that I may have spawned using "run-any-agent". What I found surprising was that there were several instances of "run-any-agent" - each with different PIDs. Check these out:
[1]

DTW     5740  0.0  0.0   2988  1368 pts/0    S    11:30   0:00 /bin/bash ./run-any-agent MyProgram1
DTW     5741  0.0  0.0   2988  1372 pts/0    S    11:30   0:00 /bin/bash ./run-any-agent MyProgram2
DTW     5746  0.1  1.2 1197324 24304 pts/0   Sl   11:30   0:02 java -server -Xmx1024M -Xms512M -cp ...[There's a lot of more after this]

[2]

DTW     6264  0.0  0.0   2988  1364 pts/0    S    11:34   0:00 /bin/bash ./run-any-agent MyProgram1
DTW     6265  0.0  0.0   2988  1364 pts/0    S    11:34   0:00 /bin/bash ./run-any-agent MyProgram2
DTW     6270  0.1  1.1 1197184 23812 pts/0   Sl   11:34   0:02 java -server -Xmx1024M -Xms512M -cp...[There's a lot of more after this]

[3]

DTW     6317  0.0  0.0   2988  1372 pts/0    S    11:34   0:00 /bin/bash ./run-any-agent MyProgram1
DTW     6318  0.0  0.0   2988  1372 pts/0    S    11:34   0:00 /bin/bash ./run-any-agent MyProgram2
DTW     6323  0.1  1.1 1196944 23408 pts/0   Sl   11:34   0:02 java -server -Xmx1024M -Xms512M -cp...[There's a lot of more after this] 

I realized that killing these processes "manually" using the "kill -9" command doesn't seem to do anything. For example I did:

kill -9 6323

But I'm not sure anything happened. What should I do? Why are these processes being spawned so many times? I'm not sure what's going on here. :frowning:

Thanks,

DTW

If you are using the script mentioned above and running all the programs in the background with the & operator, you can use the command below:

This will show all background processes. The second column in the output shows the process ID. You can use

to kill as usual.

-Nithin

Cool. Thank you very much.

Thank you again.

---------- Post updated at 02:14 PM ---------- Previous update was at 02:11 PM ----------

So, I was looking at the kill-all-agents script here:

#!/bin/bash
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}
for agent in "$@"
do
  ./run-any-agent $agent &   # run in background
done
wait

I was wondering - do these lines really need to be there?

for agent in "$@"
do
  ./run-any-agent $agent &   # run in background
done

Aren't they starting the agents back up? I'm confused.

DTW
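
On the question above: yes, as posted, the kill-all-agents script contains the same launch loop as run-all-agents, so running it starts a fresh set of agents (which it then stops on Ctrl+C) and never touches the earlier ones. One way to have a separate stop script, sketched under assumptions: the launcher records each PID in a file and the stopper reads that file. The PID file name, the LAUNCH variable and both function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: PIDFILE, LAUNCH and the function names are assumptions.
PIDFILE=${PIDFILE:-./agents.pids}

# Start one background process per argument and record its PID.
# LAUNCH is the command to run; it defaults to the thread's wrapper script.
start_agents() {
  : > "$PIDFILE"
  local agent
  for agent in "$@"; do
    ${LAUNCH:-./run-any-agent} "$agent" &
    echo "$!" >> "$PIDFILE"
  done
}

# Send SIGTERM to every recorded PID; this starts nothing.
stop_agents() {
  local pid
  while read -r pid; do
    kill "$pid" 2>/dev/null || true
  done < "$PIDFILE"
  rm -f "$PIDFILE"
}
```

One caveat: the recorded PID belongs to whatever was launched directly. With a wrapper script, that is the wrapper's shell, so a java child may outlive it unless the wrapper's last line uses exec.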

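Use kill -9 only as a very last resort. It gives the process no chance to clean up, and killing a parent that way can leave its child processes running.
Perhaps the problem is that the java process is itself a subprocess of run-any-agent, so killing the script does not stop java.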

What happens if you combine the two scripts into one and you use something like this:

#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

for i in "$@"
do
  java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
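
An alternative to merging the scripts, if keeping run-any-agent separate is preferred: put exec in front of the final java command, so that java replaces the wrapper shell instead of running as its child. The PID the caller sees is then java's own. The sketch below illustrates the difference with sleep standing in for java and "sh -c" standing in for the wrapper script; the helper names and LAST_PID are made up, and the /proc fallback is Linux-specific.

```shell
#!/bin/bash
# `sleep` stands in for `java`; `sh -c` stands in for the wrapper script.

# Wrapper runs the program as a child: the recorded PID is the shell's.
start_plain() { sh -c "sleep $1; :" & LAST_PID=$!; }

# Wrapper replaces itself with the program: the recorded PID is the program's.
start_exec() { sh -c "exec sleep $1" & LAST_PID=$!; }

# Print the command name currently running under a PID
# (falls back to /proc on systems without ps).
command_of() {
  ps -p "$1" -o comm= 2>/dev/null || cat "/proc/$1/comm"
}

# Usage sketch:
#   start_exec 30; sleep 1; command_of "$LAST_PID"; kill "$LAST_PID"
```

After start_exec, the recorded PID reports the sleep command, so a single kill reaches the real program; after start_plain it reports the shell, and killing it leaves the sleep behind.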

OK. Thank you for your thoughts.

Before I try your experiment, I commented out the following lines from your script:

for agent in "$@"
do
  ./run-any-agent $agent &   # run in background
done

I guess it didn't do anything...

Next, I restarted my machine and ran the "run-all-agents" script like this:

sh ./run-all-agents MyProgram1 MyProgram2 MyProgram3

Then, I went through all the processes using the following command:

ps aux | more

And killed the three processes I'd spawned, using the "kill -9" command. They died. However, this process doesn't seem to die (despite repeated attempts to kill it):

DTW    13986  0.1  1.0 1196768 21528 pts/0   Sl   14:21   0:00 java -server -Xmx1024M -Xms512M -cp ....

Now, I'll try restarting the machine and then try your experiment and report back.

Thanks,

DTW

---------- Post updated at 03:07 PM ---------- Previous update was at 02:56 PM ----------

#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

for i in "$@"
do
  java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait

So, I did what you suggested and called the above script, "test-script1". I ran it like this:

 sh ./test-script1 MyProgram1 MyProgram2 MyProgram3

It started up. Then, when I hit "Ctrl+C", it stopped. Everything stopped nicely. I even ran "ps aux | more" and looked at the running processes carefully, but there was no sign of any of the three programs running. Also, there was no sign of the "java server" running either, for the first time. :slight_smile: I can test this out and see if the programs are actually functional and report back. Does that sound fair?

DTW

---------- Post updated at 03:30 PM ---------- Previous update was at 03:07 PM ----------

Hi!
I'm back. I did try to run the programs too and they ran just fine. I tried stopping/starting the programs and they did so every single time. I'm so happy that this is finally working! Thank you very much for all your help, Scrutinizer.

I do have one more question, though. Before I start the programs, I fire up a server using the following script:

#!/bin/bash
#
# Usage
#   sh ./runServer.sh
#

TACAA_HOME=`pwd`
LIB=${TACAA_HOME}/lib
CLASSPATH=.
for i in $( ls ${LIB}/*.jar ); do
    CLASSPATH=${CLASSPATH}:$i
done


java -cp $CLASSPATH se.sics.tasim.sim.Main

So far, I've been able to "kill" the server by just hitting Ctrl+C. However, this doesn't always work. Is there a way to nicely kill this process too? That way, we'd have a script to nicely start/stop the server and then another one to nicely start/stop the programs.

My next goal (after the server start/stop script is implemented and tested) is to truly understand what we did so that I actually learn some scripting too. :slight_smile:

Sincerely,

DTW

Hi, glad it works out so well. Perhaps you could try the same thing with the server too, see how that pans out:

#!/bin/bash
#
# Usage
#   sh ./runServer.sh
#

trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

TACAA_HOME=`pwd`
LIB=${TACAA_HOME}/lib
CLASSPATH=.
for i in $( ls ${LIB}/*.jar ); do
    CLASSPATH=${CLASSPATH}:$i
done

java -cp $CLASSPATH se.sics.tasim.sim.Main &

wait

I tried out your next script too and it works like a charm! I used to spend hours trying to keep track of what was going on with the programs. I often had so many terminals up and running that it was just plain ridiculous. Also, the processes would never die, which used to mess up the experiments I was trying to run. Truly, thank you very much for all your precious help, Scrutinizer.

My next job will be to go through your scripts and try to comment them in detail. I will be back with questions, I'm sure. So, stand by. :slight_smile:

Gratefully,

DTW

P.S: Also if you PM me your name, I can put it in as the primary author for these scripts. I really don't want to take the credit for work that I've not done.

---------- Post updated at 04:28 PM ---------- Previous update was at 03:53 PM ----------

Let's consider this script, first:

#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

for i in "$@"
do
  java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait

OK, so, I'm trying to get an idea of the high-level flow here. As I understand it, this is what is executed first:

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

for i in "$@"
do
  java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait

This essentially fires up the programs by taking their names as command-line arguments and starting each one as a background process. After it has done this, it waits. Then, if Ctrl+C is pressed, this piece of the code is executed:

trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

I think "INT" stands for interrupt (which is caused by the user hitting Ctrl+C). Then, the routine "killsubs()" is called. This first outputs the line "CTRL-C was pressed". Then it gets all the PIDs of the jobs. I'm not sure what -p does. On the man pages, we have:

What does that mean? "xargs" looks like it gets all the items from the standard input and then executes the command "kill" on them. Then, we inform the user that the jobs were killed and exit cleanly. Is that a fair first pass?

Thanks,

DTW
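
A note on the flow described above: the script runs top to bottom, so the trap line comes first, but it only registers killsubs as the handler and runs nothing yet. The handler body executes later, when the signal arrives while the script sits in wait. "jobs -p" prints just the process IDs of the shell's background jobs, one per line. The sketch below restates this with sleep standing in for the java agents. It assumes bash, adds TERM to the trap (an assumption, so that a plain kill of the script also triggers the cleanup), and uses kill $(jobs -p), which has the same effect as the thread's "jobs -p | xargs kill".

```shell
#!/bin/bash
# `sleep` stands in for the java agents; run_agents is an illustrative name.
killsubs() {
  echo "Signal received"
  kill $(jobs -p)             # jobs -p: one PID per background job
  echo "Jobs were killed"
  exit
}

run_agents() {
  trap killsubs INT TERM      # 1. register the handler (runs nothing yet)
  local n
  for n in "$@"; do
    sleep "$n" &              # 2. start each job in the background
  done
  wait                        # 3. block until the jobs end or a signal arrives
}

# Usage sketch: run_agents 60 60   (then press Ctrl+C)
```

This may also explain why Ctrl+C alone did not stop agents launched with & from a script: when job control is off, bash starts background commands with SIGINT ignored, so it is the explicit kill in the handler (SIGTERM by default) that stops them.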

---------- Post updated at 05:45 PM ---------- Previous update was at 04:28 PM ----------

I have a quick question:

trap killsubs INT
killsubs()
{
  echo "CTRL-C was pressed"
  jobs -p|xargs kill
  echo "Jobs were killed"
  exit
}

Is there any way we could nicely print out the names of the jobs we killed? I tried putting in:

trap killsubs INT
killsubs()
{
  echo "CTRL+C was pressed"
  jobs -p|xargs echo
  jobs -p|xargs kill
  echo "Agents were killed!"
  exit
}

But this printed only the PIDs.

Thanks,

DTW

In my case, the job names were in the 4th and 5th fields of the "jobs -l" output. The code below shows the job names to the user before killing them.

trap killsubs INT
killsubs()
{
  echo "CTRL+C was pressed"
  jobs -l | awk '{ print $4,$5 }'
  jobs -p|xargs kill
  echo "Agents were killed!"
  exit
}

Hope this helps.
-Nithin.

Hey Nithin,

Thanks for your suggestion.

trap killsubs INT
killsubs()
{
  echo "CTRL+C was pressed"
  jobs -l | awk '{ print $i }'
  jobs -p|xargs kill
  echo "Agents were killed!"
  exit
}

I tried the code above and changed i from 0 right up to 13, but I didn't see the names of my programs anywhere. :frowning: Is there anything else I can try?

Thanks,

DTW

---------- Post updated at 05:17 PM ---------- Previous update was at 02:41 PM ----------

So, I finally managed to get the names working. I'll post more, later. :slight_smile:

DTW

OK, so I'm back. Apparently, adding this line to the script did the trick:

jobs -p | while read pid;do ps -p $pid -oargs | perl -pe 's/.*?config\/(.*?).conf/$1/';done | grep -v COMMAND

I had a buddy help me write that out. I have no idea what it exactly does in detail, though. :slight_smile:
So, the full script is now:

#!/bin/bash
#
# Usage
# From the "Client" directory type:
# ./run-these-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
  echo
  echo "CTRL+C was pressed"
  echo "The following agents were killed!"
  jobs -p | while read pid;do ps -p $pid -oargs | perl -pe 's/.*?config\/(.*?).conf/$1/';done | grep -v COMMAND
  jobs -p|xargs kill
  exit
}

TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
  CLASSPATH=${CLASSPATH}:$i
done

for i in "$@"
do
  java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
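
To unpack the name-printing line in the script above: for each PID from "jobs -p", "ps -p $pid -oargs" prints a COMMAND header plus that process's full command line; the perl substitution replaces everything up to and including "config/NAME.conf" with NAME; and "grep -v COMMAND" drops the header lines. Two small observations: the dot in ".conf" is unescaped, so it matches any character (\.conf would be stricter), and "-o args=" suppresses the header so the final grep is not needed. The same extraction can be sketched in plain bash, assuming the command line has the form "... -config config/NAME.conf"; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: pull the agent name out of a java command line of the
# form "... -config config/NAME.conf".
agent_name() {
  local cmdline=$1
  local name=${cmdline##*config/}   # drop everything through the last "config/"
  printf '%s\n' "${name%%.conf*}"   # drop ".conf" and anything after it
}

agent_name "java -server -cp lib/a.jar edu.umich.eecs.tac.aa.agentware.Main -config config/MyProgram1.conf"

# Usage sketch inside killsubs:
#   jobs -p | while read -r pid; do agent_name "$(ps -p "$pid" -o args=)"; done
```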

Thanks to all who contributed in making this script work for me. I'm grateful for your help.

Sincerely,

DTW
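
One loose end in the final script: the classpath loop parses the output of ls, which breaks if a jar path ever contains a space. A glob avoids that. A minimal sketch, with a made-up function name; the paths in the usage line simply mirror the ones in the thread:

```shell
#!/bin/bash
# Hypothetical helper: print "BASE:DIR/a.jar:DIR/b.jar" for the jars in DIR,
# using a glob instead of parsing `ls`.
build_classpath() {
  local cp=$1 dir=$2 jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue     # the glob matched nothing
    cp=$cp:$jar
  done
  printf '%s\n' "$cp"
}

# Usage sketch:
#   CLASSPATH=$(build_classpath ".:$PWD/bin" "$PWD/lib")
#   java -cp "$CLASSPATH" ...
```

Java 6 and later also accept a wildcard classpath entry such as lib/* (quoted so that the shell does not expand it), which may remove the need for the loop altogether.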