I'm really new to scripting and was wondering if you could help me out. I have the following script that I inherited:
#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-any-agent AgentName
#
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config config/$1.conf
So, when I type in
./run-any-agent MyProgram
it runs "MyProgram". However, I have many programs to run with this script at the same time, so I typically open many terminals and run the script for each program individually. This is becoming painful. Is there any way I could type in something similar to:
This should do the trick. '$*' holds the positional parameters that you use when calling "run-all-agents". It executes and starts "run-any-agent" in the background.
#!/bin/bash
for agent in $*
do
./run-any-agent $agent & # run in background
done
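A side note on '$*': left unquoted, it is split again on whitespace, so an agent name containing a space would arrive as two arguments, whereas "$@" passes each argument through intact. A minimal sketch of the difference, using a made-up helper "count_args" and made-up agent names:

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Simulate calling a script with two agent names, one containing a space.
set -- "MyProgram1" "My Program2"

unquoted=$(count_args $*)     # split again on whitespace: 3 arguments
quoted=$(count_args "$@")     # passed through intact: 2 arguments
echo "unquoted: $unquoted, quoted: $quoted"
```

For agent names without spaces, as in this thread, both forms behave the same.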
I'll mess with this a little and report back. It might also be nice to read the programs I want to run off a file (rather than "manually" typing them in). Anyway, I'll be back.
Thanks,
DTW
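On reading the program names from a file: a minimal sketch, assuming a plain text file with one agent name per line. The function name "start_agents_from_file" and the "AGENT_CMD" override are made up for illustration; by default it calls the "./run-any-agent" script from this thread.

```shell
#!/bin/bash
# start_agents_from_file FILE: start one background job per non-empty,
# non-comment line of FILE. AGENT_CMD is a hypothetical override for dry
# runs; it defaults to the ./run-any-agent script from this thread.
start_agents_from_file() {
    local file=$1 agent
    while IFS= read -r agent; do
        case $agent in ''|'#'*) continue ;; esac   # skip blank lines and comments
        ${AGENT_CMD:-./run-any-agent} "$agent" &   # run in background
    done < "$file"
    wait    # block until every agent has exited
}
```

With a (hypothetical) file "agents.txt" listing the names, this would be called as "start_agents_from_file agents.txt".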
---------- Post updated at 04:01 PM ---------- Previous update was at 03:50 PM ----------
The script seems to run just fine. I tried running two programs for now using the command:
./run-all-agents MyProgram1 MyProgram2
Is there a clean way to stop the programs, though? I used to be able to hit Ctrl+C to stop them, but that no longer works: they keep running after I press Ctrl+C. Is there a script that can be written to halt the running programs cleanly too?
The PID in the above example is 13112. The tricky part is you don't want to leave any child/zombie processes that may have spawned from your script. If it does spawn extra processes then your script will have to find all the children and kill them in order. The last process you would kill would be 13112 since that is your original process. It's definitely doable though. Hope that makes sense.
** NOTE: Notice that I left a space after MyProgramN in my grep statement. This was done to ensure you had the right process for those cases when you might have MyProgram1 and MyProgram11 running. If you're looking for MyProgram1, then you want to make sure you don't accidentally catch MyProgram1N.
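A caveat on the trailing space: it only matches when something follows the name on the command line, which is not the case when the name is the last argument. A sketch of an alternative, assuming whole-word matching with "grep -w" is acceptable. The two lines below are made-up stand-ins for "ps" output, not real output from this thread:

```shell
#!/bin/bash
# Made-up sample lines standing in for ps output (user, PID, command line).
sample='DTW 1001 /bin/bash ./run-any-agent MyProgram1
DTW 1002 /bin/bash ./run-any-agent MyProgram11'

# -w only accepts matches that are whole words, so MyProgram11 is skipped.
pid=$(printf '%s\n' "$sample" | grep -w 'MyProgram1' | awk '{ print $2 }')
echo "$pid"    # prints 1001
```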
Thanks for your reply. Cool. So far I've been doing:
ps aux | more
And then I scroll through the list to find the PIDs. Sometimes when I use the "kill -9" command, though, it doesn't seem to kill the process. I'm not sure why.
I'm not sure what "-ef" and "-v" do but I can look them up.
Yes, it makes a lot of sense. My next question then would be: How do I find out all the children or zombie processes and then go ahead killing them nicely? This way, then I'd have a neat way (thanks to you) to start the programs and then another neat way to kill them all. It would save me a lot hassle. I must say that scripts are cool.
Thanks,
DTW
P.S:
Hmm - I'd have to think about what you wrote, but, thank you for clearing that up.
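On finding the children: one possible approach is to take a list of "PID PPID" pairs and walk it from the original process downwards, printing the deepest processes first so that they can be killed before their parents. The function name "descendants" is made up for this sketch, and it assumes the pairs come from something like "ps -eo pid=,ppid=". Zombies need no killing: they are already dead and disappear once their parent collects them.

```shell
#!/bin/bash
# descendants PID: read "PID PPID" pairs on stdin and print every
# descendant of PID, each child's own descendants before the child itself.
descendants() {
    awk -v root="$1" '
        { kids[$2] = kids[$2] " " $1 }       # remember the children of each parent
        function walk(p,    n, list, i) {
            n = split(kids[p], list, " ")
            for (i = 1; i <= n; i++) {
                walk(list[i])                # grandchildren are printed first
                print list[i]
            }
        }
        END { walk(root) }
    '
}
```

For the PID from the example above, usage would look like "ps -eo pid=,ppid= | descendants 13112 | xargs kill", followed by "kill 13112" for the original process.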
---------- Post updated at 04:55 PM ---------- Previous update was at 04:34 PM ----------
---------- Post updated at 05:15 PM ---------- Previous update was at 04:55 PM ----------
So, "-e" seems to be a way to use a pattern of some sort. I'm not sure what the "f" after the "e" does, really.
"-v" specifies the string that we don't want to grep.
grep abc -v
would mean look for all files that DON'T have "abc" in them? Does this sound correct?
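For reference: "grep" filters lines rather than files, and "-v" inverts the match, so it prints the lines that do NOT contain the pattern. In "ps -ef", the "-e" selects every process and the "-f" asks for the full listing format; they are "ps" options and have nothing to do with patterns. A small sketch of "-v" on made-up input:

```shell
#!/bin/bash
# grep -v prints the LINES that do not match the pattern.
kept=$(printf '%s\n' 'abc one' 'xyz two' 'abc three' | grep -v abc)
echo "$kept"    # prints: xyz two
```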
#!/bin/bash
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
for agent in "$@"
do
./run-any-agent $agent &
done
wait
No and no. Sorry! Anyway, I redid the test and I got something like this:
Then I hit Ctrl+C. However, after a few seconds, the programs started back up. That was annoying! I'll look at your other suggestions and report back. (Though it probably won't be today since I need to be heading out; I'll get back tomorrow for sure.)
Thanks much,
DTW
---------- Post updated 01-21-10 at 10:32 AM ---------- Previous update was 01-20-10 at 06:07 PM ----------
I tried it again this morning. I typed:
sh ./run-all-agents MyProgram1 MyProgram2
The programs started up nicely. Then, to kill them, I typed:
sh ./kill-all-agents MyProgram1 MyProgram2
And then pressed Ctrl+C. This is what I saw:
./kill-all-agents: line 4: 13954 Terminated ./run-any-agent $agent
./kill-all-agents: line 4: 13955 Terminated ./run-any-agent $agent
I was happy but that was short-lived; after a few seconds, the programs started back up. What gives? I tried typing the same thing a few times, but it did the same thing. The programs continue to run. What else can I try?
Thanks,
DTW
---------- Post updated at 10:41 AM ---------- Previous update was at 10:32 AM ----------
Just to be perfectly clear:
[1] The run-any-agent script has:
#!/bin/bash
#
# Usage
# From the agent directory:
# sh ./run-agent
#
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config config/$1.conf
[2] The run-all-agents script has:
#!/bin/bash
for agent in $*
do
./run-any-agent $agent & # run in background
done
[3] The kill-all-agents has:
#!/bin/bash
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
for agent in "$@"
do
./run-any-agent $agent & # run in background
done
wait
Thanks,
DTW
---------- Post updated at 10:53 AM ---------- Previous update was at 10:41 AM ----------
But nothing seems to work. The programs keep going.
Thanks,
DTW
---------- Post updated at 11:34 AM ---------- Previous update was at 10:53 AM ----------
So, I eventually ended up restarting my machine. It looked like there were too many zombie processes for me to get rid of. I tried running the "run-all-agents" script again followed by "kill-all-agents", but it seems impossible to stop the programs once they kick off.
DTW
---------- Post updated at 12:06 PM ---------- Previous update was at 11:34 AM ----------
I'm running out of ideas...I executed the following command:
ps aux | more
And then scanned my screen for any processes that I may have spawned using "run-any-agent". What I found surprising was that there were several instances of "run-any-agent" - each with different PIDs. Check these out:
[1]
DTW 5740 0.0 0.0 2988 1368 pts/0 S 11:30 0:00 /bin/bash ./run-any-agent MyProgram1
DTW 5741 0.0 0.0 2988 1372 pts/0 S 11:30 0:00 /bin/bash ./run-any-agent MyProgram2
DTW 5746 0.1 1.2 1197324 24304 pts/0 Sl 11:30 0:02 java -server -Xmx1024M -Xms512M -cp ...[There's a lot more after this]
[2]
DTW 6264 0.0 0.0 2988 1364 pts/0 S 11:34 0:00 /bin/bash ./run-any-agent MyProgram1
DTW 6265 0.0 0.0 2988 1364 pts/0 S 11:34 0:00 /bin/bash ./run-any-agent MyProgram2
DTW 6270 0.1 1.1 1197184 23812 pts/0 Sl 11:34 0:02 java -server -Xmx1024M -Xms512M -cp...[There's a lot more after this]
[3]
DTW 6317 0.0 0.0 2988 1372 pts/0 S 11:34 0:00 /bin/bash ./run-any-agent MyProgram1
DTW 6318 0.0 0.0 2988 1372 pts/0 S 11:34 0:00 /bin/bash ./run-any-agent MyProgram2
DTW 6323 0.1 1.1 1196944 23408 pts/0 Sl 11:34 0:02 java -server -Xmx1024M -Xms512M -cp...[There's a lot more after this]
I realized that killing these processes "manually" using the "kill -9" command doesn't seem to do anything. For example I did:
kill -9 6323
But I'm not sure anything happened. What should I do? Why are these processes being spawned so many times? I'm not sure what's going on here.
---------- Post updated at 02:14 PM ---------- Previous update was at 02:11 PM ----------
So, I was looking at the kill-all-agents script here:
#!/bin/bash
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
for agent in "$@"
do
./run-any-agent $agent & # run in background
done
wait
I was wondering - do these lines really need to be there?
for agent in "$@"
do
./run-any-agent $agent & # run in background
done
Aren't they starting the agents back up? I'm confused.
You should not use kill -9 except as a very last resort. It gives the process no chance to clean up after itself, and it can leave that process's children running.
Perhaps the problem lies with the fact that the java process is a subprocess yet again.
What happens if you combine the two scripts into one and you use something like this:
#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
for i in "$@"
do
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
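One way to picture "a subprocess yet again": the trap only signals the jobs that the script itself started (the "run-any-agent" shells), and each of those shells has java as a separate child with its own PID. The sketch below counts the PIDs involved with and without "exec". Ending "run-any-agent" with "exec java ..." would be a hypothetical alternative to merging the scripts; it was not tried in this thread.

```shell
#!/bin/bash
# Without exec: the wrapper shell and the command it starts are two processes.
# The trailing ":" stops bash from quietly exec'ing the last command itself.
no_exec=$(bash -c 'echo $$; bash -c "echo \$\$"; :' | sort -u | wc -l)

# With exec: the command replaces the wrapper shell and keeps its PID.
with_exec=$(bash -c 'echo $$; exec bash -c "echo \$\$"' | sort -u | wc -l)

echo "PIDs without exec: $((no_exec)), with exec: $((with_exec))"
```

With two PIDs, killing the wrapper leaves the inner process alive; with one PID, the signal reaches the real program directly.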
Before I try your experiment, I commented out the following lines from your script:
for agent in "$@"
do
./run-any-agent $agent & # run in background
done
I guess it didn't do anything...
Next, I restarted my machine and ran the "run-all-agents" script like this:
sh ./run-all-agents MyProgram1 MyProgram2 MyProgram3
Then, I went through all the processes using the following command:
ps aux | more
And killed the three processes I'd spawned, using the "kill -9" command. They died. However, this process doesn't seem to die (despite repeated attempts to kill it):
Now, I'll try restarting the machine and then try your experiment and report back.
Thanks,
DTW
---------- Post updated at 03:07 PM ---------- Previous update was at 02:56 PM ----------
#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
for i in "$@"
do
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
So, I did what you suggested and called the above script, "test-script1". I ran it like this:
sh ./test-script1 MyProgram1 MyProgram2 MyProgram3
It started up. Then, when I hit "Ctrl+C", it stopped. Everything stopped nicely. I even ran "ps aux | more" and looked at the running processes carefully, but there was no sign of any of the three programs running. Also, for the first time, there was no sign of the "java -server" process running. I can test this out and see if the programs are actually functional and report back. Does that sound fair?
DTW
---------- Post updated at 03:30 PM ---------- Previous update was at 03:07 PM ----------
Hi!
I'm back. I did try to run the programs too and they ran just fine. I tried stopping/starting the programs and they did so every single time. I'm so happy that this is finally working! Thank you very much for all your help, Scrutinizer.
I do have one more question, though. Before I start the programs, I fire up a server using the following script:
#!/bin/bash
#
# Usage
# sh ./runServer.sh
#
TACAA_HOME=`pwd`
LIB=${TACAA_HOME}/lib
CLASSPATH=.
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
java -cp $CLASSPATH se.sics.tasim.sim.Main
So far, I've been able to "kill" the server by just hitting Ctrl+C. However, this doesn't always work. Is there a way to nicely kill this process too? That way, we'd have a script to nicely start/stop the server and then another one to nicely start/stop the programs.
My next goal (after the server start/stop script is implemented and tested) is to truly understand what we did so that I actually learn some scripting too.
Hi, glad it works out so well. Perhaps you could try the same thing with the server too, see how that pans out:
#!/bin/bash
#
# Usage
# sh ./runServer.sh
#
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
TACAA_HOME=`pwd`
LIB=${TACAA_HOME}/lib
CLASSPATH=.
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
java -cp $CLASSPATH se.sics.tasim.sim.Main &
wait
I tried out your next script too and it works like a charm! I used to spend hours trying to keep track of what was going on with programs. I often had so many terminals up and running it was just plain ridiculous. Also, the processes would never die and it used to mess up the experiments I was trying to run. Truly, thank you very much for all your precious help, Scrutinizer.
My next job will be to go through your scripts and try to comment them in detail. I will be back with questions, I'm sure. So, stand by.
Gratefully,
DTW
P.S: Also if you PM me your name, I can put it in as the primary author for these scripts. I really don't want to take the credit for work that I've not done.
---------- Post updated at 04:28 PM ---------- Previous update was at 03:53 PM ----------
Let's consider this script, first:
#!/bin/bash
#
# Usage
# From the agent directory:
# ./run-all-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
for i in "$@"
do
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
OK, so, I'm trying to get an idea of the high-level flow here. According to me this is what is executed first:
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
for i in "$@"
do
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
This essentially fires up the programs by taking them in as command line arguments and starting them each off as a background process. After it has done this, it waits. Then if Ctrl+C is executed, this piece of the code is executed:
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
I think "INT" stands for interrupt (the signal sent when the user hits Ctrl+C). Then, the routine "killsubs()" is called. This first outputs the line "CTRL-C was pressed". Then it gets all the PIDs of the jobs. I'm not sure what -p does. On the man pages, we have:
What does that mean? "xargs" looks like it gets all the items from the standard input and then executes the command "kill" on them. Then, we inform the user that the jobs were killed and exit cleanly. Is that a fair first pass?
Thanks,
DTW
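To make the "jobs -p" step concrete: it prints only the process ID of each background job that the current shell started, one per line, which is exactly what "kill" needs as arguments. A small sketch, with "sleep" standing in for the java processes:

```shell
#!/bin/bash
sleep 30 &              # stand-ins for the agents
sleep 30 &

pids=$(jobs -p)         # one PID per line, one line per background job
echo "killing:" $pids
kill $pids              # same effect as: jobs -p | xargs kill
wait                    # collect the finished jobs so nothing lingers
```

In the real script the same commands sit inside "killsubs", so they only run when the INT trap fires.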
---------- Post updated at 05:45 PM ---------- Previous update was at 04:28 PM ----------
I have a quick question:
trap killsubs INT
killsubs()
{
echo "CTRL-C was pressed"
jobs -p|xargs kill
echo "Jobs were killed"
exit
}
Is there any way we could nicely print out the names of the jobs we killed? I tried putting in:
trap killsubs INT
killsubs()
{
echo "CTRL+C was pressed"
jobs -p|xargs echo
jobs -p|xargs kill
echo "Agents were killed!"
exit
}
trap killsubs INT
killsubs()
{
echo "CTRL+C was pressed"
jobs -l | awk '{ print $i }'
jobs -p|xargs kill
echo "Agents were killed!"
exit
}
In the second version, I replaced i with each number from 0 right up to 13, but I didn't see the names of my programs anywhere. Is there anything else I can try?
Thanks,
DTW
---------- Post updated at 05:17 PM ---------- Previous update was at 02:41 PM ----------
So, I finally managed to get the names working. I'll post more, later.
OK, so I'm back. Apparently, adding this line to the script did the trick:
jobs -p | while read pid;do ps -p $pid -oargs | perl -pe 's/.*?config\/(.*?).conf/$1/';done | grep -v COMMAND
I had a buddy help me write that out. I have no idea what it exactly does in detail, though.
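Reading that line step by step: "jobs -p" lists the PIDs of the background jobs, "ps -p $pid -oargs" prints a "COMMAND" header followed by the full command line of each PID, the perl substitution keeps only the text between "config/" and ".conf", and "grep -v COMMAND" drops the header lines. A sketch of the last three steps on a made-up sample, with "sed -E" swapped in for perl and the classpath elided:

```shell
#!/bin/bash
# Sample of what "ps -p PID -o args" prints: a header, then the command line.
sample='COMMAND
java -server -Xmx1024M -Xms512M -cp ... edu.umich.eecs.tac.aa.agentware.Main -config config/MyProgram1.conf'

# Drop the header, then keep only the text between "config/" and ".conf".
agent_name=$(printf '%s\n' "$sample" | grep -v COMMAND | sed -E 's/.*config\/(.*)\.conf.*/\1/')
echo "$agent_name"    # prints MyProgram1
```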
So, the full script is now:
#!/bin/bash
#
# Usage
# From the "Client" directory type:
# ./run-these-agents AgentName1 AgentName2 ...
#
trap killsubs INT
killsubs()
{
echo
echo "CTRL+C was pressed"
echo "The following agents were killed!"
jobs -p | while read pid;do ps -p $pid -oargs | perl -pe 's/.*?config\/(.*?).conf/$1/';done | grep -v COMMAND
jobs -p|xargs kill
exit
}
TAC_AGENT_HOME=`pwd`
LIB=${TAC_AGENT_HOME}/lib
CLASSPATH=.
CLASSPATH=${CLASSPATH}:${TAC_AGENT_HOME}/bin
for i in $( ls ${LIB}/*.jar ); do
CLASSPATH=${CLASSPATH}:$i
done
for i in "$@"
do
java -server -Xmx1024M -Xms512M -cp $CLASSPATH edu.umich.eecs.tac.aa.agentware.Main -config "config/$i.conf" &
done
wait
Thanks to all who contributed in making this script work for me. I'm grateful for your help.
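A possible refinement for later: the classpath loop in these scripts parses the output of "ls", which breaks if a jar path ever contains a space. A sketch that lets the shell expand the pattern itself; the function name "build_classpath" is made up, and it leaves out the extra "bin" entry that the agent script adds.

```shell
#!/bin/bash
# build_classpath DIR: print "." followed by every .jar in DIR, colon-separated.
build_classpath() {
    local dir=$1 cp=. jar
    for jar in "$dir"/*.jar; do
        [ -e "$jar" ] || continue      # skip the literal pattern when no jars exist
        cp=$cp:$jar
    done
    printf '%s\n' "$cp"
}
```

In the scripts above it would be used as CLASSPATH=$(build_classpath "$LIB"), with "$CLASSPATH" quoted on the java command line.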