While loop - how to run processes one after another (2nd starts after first completes, and so on)

I'm a programming noob. I'm trying to run a memory-intensive process on many files. But when I use the following script, it runs fine for the first 5-7 files, then runs out of memory. Monitoring the output files, it's clear the processes are running in parallel. Once 5-7 files are being worked on at once, I get a crash. I'd like to run the script so that the first process completes before the next one starts (or the effective equivalent). Any pointers?

#!/bin/bash

i=1

while [ $i -le 197 ]
do
my.program.command.here
i=$[$i+1]
done

The loop runs as one process, line by line.
What exactly is

my.program.command.here

?
It must not end with a & character.
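For illustration, using the placeholder command from your script:

# with a trailing &, the command is put in the background and the loop
# immediately starts the next one, so all 197 pile up in parallel:
my.program.command.here &

# without the &, the shell waits for the command to finish before
# the loop body continues:
my.program.command.here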

You can mitigate your problem by inserting a sleep command in the while loop.

Sorry, I didn't think the program was relevant. I'm running software called usearch.

#!/bin/bash

i=1

while [ $i -le 197 ]
do
usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab 
i=$[$i+1]
done

I hadn't heard of the sleep command. So something like this would delay the processing of file.2 by 45 seconds from the initiation of the program on file.1?

#!/bin/bash

i=1

while [ $i -le 197 ]
do
usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab 
sleep 45
i=$[$i+1]
done

Does your program spin off a child and then exit, creating a parallel situation? Sometimes you can stifle that by staying attached to all of the child's stdout and stderr until they close:

usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab 2>&1 | cat -u

No sleep is necessary if they run one at a time.

I've tried both of these and for whatever reason neither has worked. Using sleep allowed the process to get through 12 files, but it crashed on the 13th. Without sleep it gets through 5-6 files. The output solution didn't seem to make a difference.

Unless you present more details about what usearch does and how it works, we (and you) are doomed.
One - very inelegant - workaround would be to create nested loops (1 <= i <= 20) and (1 <= j <= 5) and run five instances of usearch at a time, wait for them to finish, and then go on with the next group.
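A minimal sketch of that batching idea (reusing the usearch command line from your script and the shell's built-in wait; not necessarily the cleanest way to do it):

#!/bin/bash

i=1
while [ $i -le 197 ]
do
    j=0
    # start up to 5 instances in the background
    while [ $j -lt 5 ] && [ $i -le 197 ]
    do
        usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab &
        i=$((i+1))
        j=$((j+1))
    done
    # wait for this group to finish before launching the next one
    wait
done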

Run the first 5 usearch commands, then run

ps -eo rss,vsz,pid,comm | sort -n | tail
ps -eo pcpu,pid,comm | sort -n | tail

to show us the top 10 processes, and

uname

Below is the final part of stdout, including the status of the usearch run on the fifth file, followed by the output of the following three commands, which I appended to the end of my script.

Unfortunately, it's not clear to me what information about usearch would be useful to your efforts to help me. It's a program that aligns sequences in one file to sequences in another file. I'm using the 32-bit version (the 64-bit version would likely do what I need, but it's expensive, whereas the 32-bit version is free). USEARCH manual

Thanks for your efforts here.

ps -eo rss,vsz,pid,comm | sort -n | tail
ps -eo pcpu,pid,comm | sort -n | tail
uname

Disk is cheap; can you make more swap on some other local file system? http://www.cyberciti.biz/faq/linux-add-a-swap-file-howto/
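Roughly the steps the linked how-to walks through, assuming a 4 GB swap file at /swapfile and run as root:

dd if=/dev/zero of=/swapfile bs=1M count=4096   # create a 4 GB file
chmod 600 /swapfile                             # swap files must not be world-readable
mkswap /swapfile                                # format it as swap
swapon /swapfile                                # enable it
swapon -s                                       # confirm it is active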

00:14 2.1Gb 100.0% Searching, 12.4% matched

That's interesting, how the process has managed to rename itself 'Searching'. I wonder if it forks every once in a while to tell ps how much farther it's gotten...

Whatever it is, it's on the hair's edge of biting the 32-bit memory limit and dying... There may be a reason the 32-bit version is free. More swap won't help, since there's still a per-process limit in 32-bit.

Does this process quit after it's finished? You can pgrep for 'Searching'...
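Something like this inside the loop, assuming 'Searching' really is the name the renamed process shows in ps:

usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab
# block here until nothing matching 'Searching' is left running
while pgrep Searching > /dev/null
do
    sleep 5
done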

If the limit is per process, with each instance allocating all of its 4G address space, then running in parallel or not has no effect, and yes, swap does not add address space.

If they crash because they exhaust swap running in parallel, more swap may help, but in the end they may thrash with excess paging unless run singly.
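One way to tell those two cases apart is to check the per-process limits and the system-wide memory while usearch is running:

ulimit -a      # per-process limits the shell (and its children) run under
free -m        # total RAM and swap, and how much of each is in use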

It is peculiar that the process gives its parent the slip, almost like it is registering the search with a server and exiting.

4 gigs in theory; in practice often half of that, depending on which address ranges have been reserved for which purposes. No more room for more heap, but plenty left for stack, etc.

Yes, they do not seem to be flexible about allowing you a 3.9G heap, stack, or mmap(). Even in 64-bit compilers, there are several different pointer-size options with varying sub-64-bit address ranges.
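For example, gcc on x86 exposes that choice directly (an aside, nothing to do with usearch itself):

gcc -m32  prog.c     # 32-bit pointers, ~4 GB address space
gcc -mx32 prog.c     # 32-bit pointers on a 64-bit kernel (the x32 ABI)
gcc -m64  prog.c     # full 64-bit pointers and address space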

One (hacky) method to force each iteration of the loop to wait until usearch and all of its children have exited is to use strace -f.

A simple demonstration ...

sneaky.sh: prints 1 and exits while a sleeping subshell is left behind to print 2 at a later time.

#!/bin/sh

echo 1
( sleep 5; echo 2 ) &

loop.sh: runs sneaky.sh, almost certainly completing all loop iterations before any of the sleeping subshells awake.

#!/bin/sh

for i in 1 2 3; do
    ./sneaky.sh
done

Sample run:

$ ./loop.sh
1
1
1
$2
2
2

loop-strace.sh: exploits the fact that strace -f does not return until all children have exited:

#!/bin/sh

for i in 1 2 3; do
    strace -f -o /dev/null -e trace=process ./sneaky.sh
done

Sample run:

$ ./loop-strace.sh
1
2
1
2
1
2
$

If you are interested in seeing the processes that usearch is creating and how it's managing them, use a real file instead of /dev/null.
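Applied to your loop, that might look like this (trace.$i.log is just an assumed name for the per-run log file):

#!/bin/bash

i=1

while [ $i -le 197 ]
do
    # strace -f follows children, so this line does not return until
    # usearch and everything it spawned have exited
    strace -f -o trace.$i.log -e trace=process usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab
    i=$((i+1))
done

Afterwards you can grep the trace.$i.log files for execve or clone to see whether usearch really hands its work off to other processes.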

Regards,
Alister

Well, strace does add overhead, so you might look at the strace output to see what is exec'd that you could then test for with ps. Its output is probably still on your terminal even if it is disconnected from your shell.

If any of the output files is per-run and is closed at the end, you might be able to make a named pipe the target and track its lifetime by having cat read the output out of the named pipe. strace can tell you what is being closed at the end, too.
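A rough sketch of that named-pipe idea, with the pipe path and filenames as placeholders only:

mkfifo /tmp/usearch.$i.fifo                  # create a named pipe as the output target
usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout /tmp/usearch.$i.fifo &
cat /tmp/usearch.$i.fifo > output.$i.tab     # returns only once every writer has closed the pipe
rm /tmp/usearch.$i.fifo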

Calling fuser on files can find user processes, but it is a bit slow.

No, this is not from the ps output, but certainly from the previous command. The ps list follows, and only has 10 items.
(BTW, the process can change its *ARGV[0] string.)

I have not yet seen an indication that usearch directly spawns processes.
If these are spawned by a service daemon, the strace won't show anything useful.
What does

pstree -A

give shortly after running a usearch command?

So far, it is possible that the processes trigger a VM bug in the kernel, and the system runs out of memory without any single process being responsible.
What does

uname -a

give?

Well, strace could tell us the exact fault.

Could try this...

#!/bin/bash

i=1

while [ $i -le 197 ]
do
usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab 
wait $!
i=$[$i+1]
done

How do you get $! without & in the shell running the script, unless usearch is an alias?
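For $! to be set at all, the command has to be started in the background first, so the loop body would presumably have to look something like this:

#!/bin/bash

i=1

while [ $i -le 197 ]
do
    # $! only exists for a backgrounded job, so start usearch with &
    usearch -usearch_global file.$i -db db.file -strand both -id 0.5 -userout output.$i.tab &
    wait $!
    i=$((i+1))
done

That behaves the same as simply running usearch in the foreground, and it still won't catch any workers that usearch re-parents away from the script.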
