Hi,
I'm trying to write a script to decompress a directory full of files. The decompression commands can run in the background, so that many can run at once. But I want to limit the number running at any one time, so that I don't overload the machine.
Something like this:
n=0
for i in *.gz
do
    gzip -d "$i" &
    n=$((n+1))
    if [ $n -ge 10 ]; then
        : # XXX Not sure what to do here
    fi
done
At the marked spot, I want to wait for one of my background processes to complete. I don't mind which one, but I do want to wait for just one.
wait doesn't work, as it waits for all jobs to complete. On the other hand, wait N doesn't work, because I don't know which job will finish first.
I could use trap "..." 20, but I'd need to be able to pause my script at the XXX line and be able to resume it via the "..." from the trap command. I can't think of a way of doing this ("suspend" in bash might work, but really I need this to work in ksh - I'm not sure the server this will ultimately run on has bash installed).
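Worth noting for the record: if the server does turn out to have bash (4.3 or newer), its wait -n builtin does exactly what the XXX line needs - it blocks until any one background job finishes, no pid required. A minimal sketch, bash-only (ksh doesn't have wait -n):

```shell
#!/usr/bin/env bash
# Requires bash 4.3+ for 'wait -n' (wait for any ONE background job).
n=0
for i in *.gz; do
    gzip -d "$i" &
    n=$((n+1))
    if [ "$n" -ge 10 ]; then
        wait -n        # block until any single background job exits
        n=$((n-1))
    fi
done
wait                   # wait for the stragglers
```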
nope, I don't think you'll do it in ksh.
you'll need waitpid.
I would, get the list of files,
divide by the number of processes you want
and send that many files off via xargs
e.g. 10
set `ls *.gz` # sets $1 $2 $3 ...
# $# = the count
ls *.gz | xargs -n$(( $# / 10 )) gunzip
i don't know how set will react if you have hundreds of files
you might get 'command line too long'
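One more option along those lines, if the box has GNU or BSD xargs: the -P flag (not POSIX, so check the man page first) does the parallel throttling itself, which avoids the set/$# arithmetic and the long-command-line risk entirely:

```shell
# Decompress files with at most 10 gunzip processes running at once.
# -P 10 : parallelism limit (GNU/BSD xargs extension, not POSIX)
# -n 1  : one filename per gunzip invocation
# -print0 / -0 : keep filenames with spaces or newlines intact
find . -maxdepth 1 -name '*.gz' -print0 | xargs -0 -n 1 -P 10 gunzip
```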
Thanks, that's an approach I hadn't thought of. One thing it doesn't allow me to do is to report progress - something I'd thought of adding to my original approach was to add a "printf '.'" whenever I started a new decompress. But that's just a nice-to-have - your suggestion gets the job done.
I'd still be interested in any other possibilities that anyone can suggest - this is my first venture into anything more complicated than very basic scripts, and I'm learning a lot I didn't know!
I think this is too much for a shell and using C or at least some real scripting language may be required here. However I'd love to see a solution for shell if possible.
I tried a perl solution and got really bogged down because I couldn't find an easy way of running a background command (disclaimer: it's a VERY long time since I used perl, but I don't have Python on the box I'm working with :-() Messing round with
seems fraught with potential issues that I don't understand (for a start, it doesn't handle shell metacharacters - should I use exec "sh", "-c", @_ or some similar incantation?)
If someone can confirm a decent Perl equivalent of the shell
My basic idea for a solution would be to spawn the initial N workers and save their pids in a table, then sleep 1 and see which of the PIDs are still alive. For those that are not, spawn the next worker and save its pid in place of the old one. Repeat until the job is done.
The problem with counting via pgrep is that it will also count processes that may not be related to the script (any other user can run their own gzip, right?).
You could put the procs in background and use the shell var $! to save the pid. Not sure if it'll help or if it would be as easy to implement as it sounds. If not, then you could try perl or any other high level scripting languages.
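That pid-table idea can be sketched in plain shell: save each worker's pid from $!, and poll the table with kill -0, which sends no signal but succeeds only while the process still exists. A rough sketch of the idea (uses the positional parameters as the pid table; should work in both ksh and bash):

```shell
max=10
set --                          # "$@" holds the pids of live workers
for i in *.gz; do
    # if we're at the limit, poll until somebody finishes
    while [ $# -ge $max ]; do
        live=
        for pid in "$@"; do
            # kill -0 sends no signal; it just tests that the pid exists
            if kill -0 "$pid" 2>/dev/null; then
                live="$live $pid"
            fi
        done
        set -- $live            # keep only the survivors
        if [ $# -ge $max ]; then
            sleep 1
        fi
    done
    gzip -d "$i" &
    set -- "$@" $!              # remember the new worker's pid
done
wait
```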
Does that look like a reasonable approach? It seems reasonably clean - although not as nice as my original ksh attempt (which had the disadvantage that it didn't work, of course :))
I'm not a perl expert, but you don't seem to loop through all the pids and check their values.
You should loop through all the pids with a non-blocking waitpid, and if a process is not running, spawn a new one in its place.
It seems to me that you are now waiting for the first pid in the queue and, when it finishes, spawning another one. While that's quite good, it may happen that of the first 10 pids, numbers 2-9 have ended while number 1 runs for a very long time. You would end up with only one worker running for most of the time, which I guess is what you've tried to avoid.
Ah! It never occurred to me that I could use ps to check if workers were still running. That looks pretty good. And it looks like it will work just as well in ksh, too.
I'll give it a try. Thanks
---------- Post updated at 07:20 PM ---------- Previous update was at 04:06 PM ----------
I've now got a pretty neat solution in Perl (currently only tested on cygwin, but I see no reason it shouldn't work properly on "real" Unix). For those who might be interested, this is the final result:
#!/usr/bin/perl
use POSIX ":sys_wait_h";

%pids = ();
$npids = 0;
$MAX_CHILDREN = 10;

$SIG{CHLD} = \&REAPER;

sub REAPER {
    my $child;
    while (($child = waitpid(-1, WNOHANG)) > 0) {
        # print "$child ($pids{$child}) died!!!\n";
        delete $pids{$child};
        $npids--;
        # print "Child died ($npids still running)\n";
    }
    $SIG{CHLD} = \&REAPER;    # re-arm, in case this system resets the handler
}

sub launch {
    my ($cmd) = @_;
    if ($npids >= $MAX_CHILDREN) {
        # print "Zzzz...\n";
        sleep;    # until a SIGCHLD wakes us up
    }
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    unless ($pid) {
        exec $cmd;
        die "exec '$cmd' failed: $!";    # only reached if exec fails
    }
    $pids{$pid} = $cmd;
    $npids++;
}

sub waitall {
    while ($npids) {
        sleep;
    }
}

for $i (<*.gz>) {
    $cmd = "gzip -d \"$i\"";
    print "Launching $cmd\n";
    launch $cmd;
}
waitall;
Thanks to all who helped me with this! It's been an interesting exercise, and I learned a lot along the way.
I'm not sure - sleep without an argument sleeps forever (until a signal comes in). So the intent of the statement is - if we have too many processes, pause. When SIGCHLD comes in, we are awakened, and control falls out of the if and (finally) starts the new process. If we don't have too many processes yet, just start a new process directly.
With an "else", wouldn't the branch that sleeps skip starting its process when it wakes up?
Why not take your original concept and build in a throttle, e.g.:
par=10
offset=$(ps | wc -l)
max=$(( par + offset ))
for i in *.gz
do
    # echo starting new process
    gunzip "$i" &
    while [ $(ps | wc -l) -ge $max ]
    do
        # echo "Throttling..."
        sleep 1
    done
done
wait
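A variant of the same throttle that avoids one concern raised earlier - ps counts every gzip on the machine, including other users' - is to count only the script's own background jobs with the jobs builtin. A sketch, assuming bash (jobs -r, "running jobs only", is a bash option; plain jobs -p in other shells may also count finished-but-unreported jobs):

```shell
max=10
for i in *.gz; do
    # jobs -pr prints the pids of OUR still-running background jobs,
    # so another user's gzip cannot inflate the count
    while [ "$(jobs -pr | wc -l)" -ge "$max" ]; do
        sleep 1
    done
    gzip -d "$i" &
done
wait
```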