Command timeout from inside script.

Hi,

I've written a very robust script to get an external IP address from 'behind' a router. It uses many web pages, randomly choosing which one(s) to use at run time. The "fetch the web page containing the IP address" step is handled by either wget or curl, both of which have their 'max time for the operation' option set to 1 second. Most of the web servers respond within 500 ms at most, so the idea of the 1-second timeout is that if a server is slow to respond, the script just moves on to another server.

I've noticed that neither curl nor wget actually keeps to the timeout. On my machine this was not a problem; the timeout always kicked in within about 2.5 seconds at the latest. Over the last 24 hours I've tested the script on a Debian machine, running it 10 times every 5 minutes. Looking at the results, I've discovered that on 14 occasions (out of 7560) the timeout failed catastrophically, taking anywhere from 27.722 to 170.643 seconds - the latter is almost 3 minutes from a 1-second timeout.

It seems that I'll have to write my own timeout routine. There are quite a lot of examples out there, but they all seem to rely on either the program 'timeout' (which is not POSIX) or a timeout script, e.g. timeout_script -t 1 command - which is not what I need at all.
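For reference, the sort of non-POSIX invocation I'm trying to avoid would be something like this (GNU coreutils 'timeout'):

timeout 1 curl --silent "$url"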

Here's a little bit of my code:

# curl command line for curl users.
curlCommand="curl --silent --max-time $timeout"

# wget command line for wget users.
wgetCommand="wget --quiet --timeout=$timeout --tries=1 --output-document=-"

urlDownloaderProg=$curlCommand

grepExpr="[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"

ipAdd=$($urlDownloaderProg "$url" | grep -Eo "$grepExpr" | uniq) 

What I need is for the line beginning 'ipAdd=' to return immediately, and then for the script to monitor its child process and kill it if it hasn't finished after 1 second. But I'm not sure how to do this - in particular, how do I get the ipAdd= line to return immediately? I had problems using '&' both inside and outside the final ')' while still storing the result of the line in the variable ipAdd. Maybe I'm going about this the wrong way and need a different approach.

Any help or advice would be greatly appreciated. Many thanks.

I don't think you can get it to return immediately inside a command substitution like that. Save its output to a temporary file instead, then use it.

$urlDownloaderProg "$url" > /tmp/$$ &
PID=$! # PID of background command
sleep 1
# If curl has already quit, killing it won't change its exit status
# from 0 into 1.
kill "$PID" 2> /dev/null
if ! wait "$PID" # non-zero status: curl failed or was killed
then
        echo "timeout" >&2
        rm -f /tmp/$$
        exit 1
fi

ipAdd=$(grep -Eo "$grepExpr" /tmp/$$ | uniq)
rm -f /tmp/$$

This has a side-effect, though: the minimum time will be one second, too. If you have Linux, you can do sub-second sleep times:

$urlDownloaderProg "$url" > /tmp/$$ &
PID=$! # PID of background command

END=$((SECONDS+1))

while [[ -d /proc/$PID && "$SECONDS" -lt "$END" ]]
do
        sleep 0.1
done

# If curl has already quit, killing it won't change its exit status
# from 0 into 1.
kill "$PID" 2> /dev/null

if ! wait "$PID" # non-zero status: curl failed or was killed
then
        echo "timeout" >&2
        rm -f /tmp/$$
        exit 1
fi

ipAdd=$(grep -Eo "$grepExpr" /tmp/$$ | uniq)
rm -f /tmp/$$

Thanks...

Well, the script needs to be POSIX, so no sub-second sleep times. Also, having a minimum time of 1 second, while acceptable, is not ideal; often the download is done in as little as 75 ms.

It would be fine, though, to split the 'ipAdd' command up; after all, it's the URL download which needs to time out - the grep and uniq part will be lightning quick.

ipAdd=$($urlDownloaderProg "$url") 

# Do this bit later:  grep -Eo "$grepExpr" | uniq

Would that help? Or is it the assignment within the $() notation which is the problem, rather than the piping?

PS. Does anyone know why both curl and wget don't seem to have a reliable "max time for whole operation"?

PPS. I know neither curl nor wget is POSIX - but they're the only part of the script that isn't, and I've got to access the web URLs somehow. :-)

The problem is that it makes no sense to background a command inside a command substitution. How could it possibly set the variable before the process completes? So either it fills the variable with a blank and puts the program in the background, or it waits anyway. Neither gets you what you want.
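You can see both behaviours with sleep standing in for the download (a hypothetical example):

x=$(sleep 2 &)              # still takes ~2 s: the substitution reads until
                            # the backgrounded sleep closes its stdout
y=$(sleep 2 > /dev/null &)  # returns at once, but y is empty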

You could replace the sleep with ':', which will make it spin until either condition breaks the loop. This will cause 100% CPU usage while it's waiting for the timeout; if you run your script with 'nice ./script.sh' that may be tolerable.

Since it needs to be POSIX you'll have to replace my math statement too - and SECONDS itself, which is a bash/ksh variable, not POSIX. Take the timestamps from date instead (%s is near-universal, though only recently standardized):

START=`date +%s`
END=`expr "$START" + 1`
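A rough all-POSIX version of the whole wait loop would then be (kill -0 merely probes whether the PID is still alive; this busy-waits, so expect full CPU until the timeout or the download finishing):

while kill -0 "$PID" 2> /dev/null && [ "`date +%s`" -lt "$END" ]
do
        : # busy-wait; POSIX sleep has no sub-second resolution
done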

Yes indeed absolutely right.

I've spent some more time on this today and have some code as a proof of concept (I know it needs error checking added), but can you guys spot any problems/errors/bad practice with the following working code?

It has no sleep required, accepts timeouts with fractions of a second, and seems to do the job.

One problem, though: the "kill -9" statement always ends up outputting a line like the one below in the shell, even with the redirect. Where am I going wrong with that?

./test: line 56:  5968 Killed   $urlDownloaderProg "$url" > $tempFileName

#!/bin/bash

# Usually responds fast ( < 200 ms)
url="http://checkip.dyndns.org"

# Usually responds slow ( > 1000 ms)
# url="http://www.dnsstuff.com/"

urlDownloaderProg="curl --silent"

tempFileName=$(mktemp --quiet "temp.XXXXXX")

$urlDownloaderProg "$url" > "$tempFileName" &
pid=$!

timeoutStart=$(date +%s.%N)

timeout="1.0"

finished="No"
processFinished="No"

while [ "$finished" = "No" ]; do

    # Probe the download process directly; grepping 'ps' output can
    # false-match "$pid" against another PID or column.
    if ! kill -0 "$pid" 2> /dev/null; then
        finished="Yes"
        processFinished="Yes"
    fi
    
    timeoutNow=$(date +%s.%N)

    bcExp="if ($timeoutNow - $timeoutStart > $timeout) { print 1 } else { print 0 }"

    bcRes=$(echo "$bcExp" | bc -l)

    if [ "$bcRes" -eq "1" ]; then
        finished="Yes"
    fi

done

if [ "$processFinished" = "Yes" ]; then

    tempFileContents=$(cat "$tempFileName")
    grepExpr="[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
    ipAdd=$(echo "$tempFileContents" | grep -Eo "$grepExpr" | uniq)
    echo "$ipAdd"

else
    kill -9 "$pid" > /dev/null 2>&1
fi

if [ -f "$tempFileName" ]; then rm "$tempFileName"; fi

Many thanks all.

Try kill without the -9.

Ok, so this is really strange; at least, I don't get it.

I created a new script with nothing in it apart from starting a process using wget to download a 100 MB file, storing the PID and then killing it. It did not matter whether I used kill or kill -9, or whether I redirected to /dev/null 2>&1 or not: the shell did not show any output concerning the kill at all, but the kill succeeded. Yet back in my 'real' script it happens every time, regardless of whether I use the -9 or redirect to /dev/null - every time I get a line like this:

./getipto: line 221: 11550 Killed $urlDownloaderProg "$urlToDownload" > $tempFileName
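For reference, the stripped-down test was roughly this (the URL is just a stand-in for the 100 MB file I actually used):

#!/bin/bash
# Minimal test: background a long wget download, then kill it.
wget --quiet --output-document=/dev/null "http://example.com/100MB.bin" &
pid=$!
sleep 1
kill "$pid"    # no 'Killed' line appeared here, with or without -9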

How could it make any difference? Inside an if statement? Inside a function? I just don't understand this behaviour at all.

Any ideas anyone?

Thanks.