Calling Bash Functions in the Background

I'm trying to call some functions in the background so that I can multitask in this script. Not working so hot. The functions I call don't ever seem to get called. I'm doing it the exact same way in another script and it's working like a champ so I'm very confused. Here's a pretty simple repro:

#!/bin/bash

AT1="machine1"
AT2="machine2"
CM2=""
CM="machine4"

check_background()
{
  # loop to check that background jobs finish
  WAIT_COUNT=0
  JOBS_COUNT="$(jobs -pr)"
  while [ "${JOBS_COUNT}" ]
  do
    let "WAIT_COUNT=${WAIT_COUNT}+1"
    if [ ${WAIT_COUNT} -gt 60 ]
    then
        echo "Waited 5 minutes for jobs to complete killing old jobs"
        kill $(jobs -pr)
        echo "Waiting 30 seconds to ensure validation is successful"
        sleep 30
    fi
    sleep 5
    JOBS_COUNT="$(jobs -pr)"
    NUMBER_JOBS=$(echo ${JOBS_COUNT} | wc -w | tr -d '[:space:]')
    echo -e "${NUMBER_JOBS} background jobs running, waiting 5 minutes (300 seconds) for them to complete"
    echo -e "Time Elapsed: $(( ${WAIT_COUNT} * 5 )) seconds\n"
  done
}

app_background_stop ()
{
  SERVER=$1
  NUM_CK=$2
  echo "Server: ${SERVER} Num: ${NUM_CK}"
  sleep 30
}
app_stop ()
{
  #Stop all the at1 servers
  for at1_server in ${AT1}
  do
     app_background_stop ${AT1} 1 > /dev/null 2>&1 &
  done
  
  #Stop all the at2 servers
  for at2_server in ${AT2}
  do
      app_background_stop ${AT2} 2 > /dev/null 2>&1 &
  done
  
  #Stop all the backup cm servers
  for cm2_server in ${CM2}
  do
      app_background_stop ${CM2} 3 > /dev/null 2>&1 &
  done
  
  #Sleep a bit so the jobs have time to get started
  sleep 5
  
  check_background
  
  #Stop the primary CM
  app_background_stop ${CM} 4 > /dev/null 2>&1 &
}

#Main routine
app_stop

And the results when I run it in debug:

$ bash -x ./test.sh
+ AT1=machine1
+ AT2=machine2
+ CM2=
+ CM=machine4
+ app_stop
+ for at1_server in '${AT1}'
+ for at2_server in '${AT2}'
+ app_background_stop machine1 1
+ sleep 5
+ app_background_stop machine2 2
+ check_background
+ WAIT_COUNT=0
++ jobs -pr
+ JOBS_COUNT='25429
25430'
+ '[' '25429
25430' ']'
+ let WAIT_COUNT=0+1
+ '[' 1 -gt 60 ']'
+ sleep 5
++ jobs -pr
+ JOBS_COUNT='25429
25430'
++ echo 25429 25430
++ wc -w
++ tr -d '[:space:]'
+ NUMBER_JOBS=2
+ echo -e '2 background jobs running, waiting 5 minutes (300 seconds) for them to complete'
2 background jobs running, waiting 5 minutes (300 seconds) for them to complete
+ echo -e 'Time Elapsed: 5 seconds\n'
Time Elapsed: 5 seconds

+ '[' '25429
25430' ']'
+ let WAIT_COUNT=1+1
+ '[' 2 -gt 60 ']'
+ sleep 5
++ jobs -pr
+ JOBS_COUNT='25429
25430'
++ echo 25429 25430
++ wc -w
++ tr -d '[:space:]'
+ NUMBER_JOBS=2
+ echo -e '2 background jobs running, waiting 5 minutes (300 seconds) for them to complete'
2 background jobs running, waiting 5 minutes (300 seconds) for them to complete
+ echo -e 'Time Elapsed: 10 seconds\n'
Time Elapsed: 10 seconds

+ '[' '25429
25430' ']'
+ let WAIT_COUNT=2+1
+ '[' 3 -gt 60 ']'
+ sleep 5
++ jobs -pr
+ JOBS_COUNT='25429
25430'
++ echo 25429 25430
++ wc -w
++ tr -d '[:space:]'
+ NUMBER_JOBS=2
+ echo -e '2 background jobs running, waiting 5 minutes (300 seconds) for them to complete'
2 background jobs running, waiting 5 minutes (300 seconds) for them to complete
+ echo -e 'Time Elapsed: 15 seconds\n'
Time Elapsed: 15 seconds

+ '[' '25429
25430' ']'
+ let WAIT_COUNT=3+1
+ '[' 4 -gt 60 ']'
+ sleep 5
++ jobs -pr
+ JOBS_COUNT='25429
25430'
++ echo 25429 25430
++ wc -w
++ tr -d '[:space:]'
+ NUMBER_JOBS=2
+ echo -e '2 background jobs running, waiting 5 minutes (300 seconds) for them to complete'
2 background jobs running, waiting 5 minutes (300 seconds) for them to complete
+ echo -e 'Time Elapsed: 20 seconds\n'
Time Elapsed: 20 seconds

+ '[' '25429
25430' ']'
+ let WAIT_COUNT=4+1
+ '[' 5 -gt 60 ']'
+ sleep 5
++ jobs -pr
+ JOBS_COUNT=
++ echo
++ wc -w
++ tr -d '[:space:]'
+ NUMBER_JOBS=0
+ echo -e '0 background jobs running, waiting 5 minutes (300 seconds) for them to complete'
0 background jobs running, waiting 5 minutes (300 seconds) for them to complete
+ echo -e 'Time Elapsed: 25 seconds\n'
Time Elapsed: 25 seconds

+ '[' '' ']'
+ app_background_stop machine4 4

Any ideas?

Thanks,
Eric

How about replacing this...

     app_background_stop ${AT1} 1 > /dev/null 2>&1 &

...with this for debugging:

      app_background_stop ${AT1} 1 > /tmp/${AT1} 2>&1 &

...and checking /tmp/${AT1} ?

You can also omit the file descriptors 0 and 1 because they are the defaults: 0 for input redirection (<) and 1 for output redirection (>).

I changed the code to:

#!/bin/bash

AT1="machine1"
AT2="machine2"
CM2=""
CM="machine4"

check_background()
{
  # loop to check that background jobs finish
  WAIT_COUNT=0
  JOBS_COUNT="$(jobs -pr)"
  while [ "${JOBS_COUNT}" ]
  do
    let "WAIT_COUNT=${WAIT_COUNT}+1"
    if [ ${WAIT_COUNT} -gt 60 ]
    then
        echo "Waited 5 minutes for jobs to complete killing old jobs"
        kill $(jobs -pr)
        echo "Waiting 30 seconds to ensure validation is successful"
        sleep 30
    fi
    sleep 5
    JOBS_COUNT="$(jobs -pr)"
    NUMBER_JOBS=$(echo ${JOBS_COUNT} | wc -w | tr -d '[:space:]')
    echo -e "${NUMBER_JOBS} background jobs running, waiting 5 minutes (300 seconds) for them to complete"
    echo -e "Time Elapsed: $(( ${WAIT_COUNT} * 5 )) seconds\n"
  done
}

app_background_stop ()
{
  SERVER=$1
  NUM_CK=$2
  echo "Server: ${SERVER} Num: ${NUM_CK}"
  sleep 30
}
app_stop ()
{
  #Stop all the at1 servers
  for at1_server in ${AT1}
  do
     app_background_stop ${AT1} 1 > /tmp/test.out 2>&1 &
  done
  
  #Stop all the at2 servers
  for at2_server in ${AT2}
  do
      app_background_stop ${AT2} 2 > /tmp/test2.out 2>&1 &
  done
  
  #Stop all the backup cm servers
  for cm2_server in ${CM2}
  do
      app_background_stop ${CM2} 3 > /tmp/test3.out 2>&1 &
  done
  
  #Sleep a bit so the jobs have time to get started
  sleep 5
  
  check_background
  
  #Stop the primary CM
  app_background_stop ${CM} 4 > /tmp/test4.out 2>&1 &
}

#Main routine
app_stop

The 3 files that got created had in them:

So it did go in there. Just didn't do what I expected it to do in the non-repro version of this script. Gotcha, thanks!

By "omit file descriptors 0 and 1", do you mean:

app_background_stop ${AT1} 1 > /tmp/${AT1} 2> &

Or just:

app_background_stop ${AT1} 1 2> &

(i.e. > is file descriptor 0 and &1 is file descriptor 1)?

Thanks for the help!

That's what I meant:

 app_background_stop ${AT1} >/tmp/${AT1} 2>&1 &

Sorry. The redirect target &1 cannot be abbreviated to &.
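One detail worth checking when dropping the "1": bash only treats a digit as a file descriptor when it sits directly against the redirection operator. With a space in between, as in the original call, the 1 is an ordinary argument and reaches the function as $2. A minimal sketch of the difference, where show_args is a hypothetical helper invented for this illustration (it is not part of the script above):

```shell
#!/bin/bash
# show_args is a hypothetical helper: it reports how many arguments it got.
show_args() {
  echo "argc=$# args=$*"
}

tmp=$(mktemp)

# Space before '>': the 1 is a second argument, stdout goes to the file.
show_args machine1 1 > "${tmp}"
with_space=$(cat "${tmp}")

# No space before '>': 1> is "file descriptor 1", so only one argument is passed.
show_args machine1 1> "${tmp}"
without_space=$(cat "${tmp}")

echo "with space:    ${with_space}"
echo "without space: ${without_space}"
rm -f "${tmp}"
```

So if the function relies on its second argument, that argument has to stay in the call; only a descriptor number glued to the operator can be left out.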

Or another example:

command 0<input.txt 1>output.txt 2>&1

...is the same as...

command <input.txt >output.txt 2>&1

...which, in bash, is equivalent to...

command <input.txt &>output.txt
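The three forms can be compared directly. This is only a sketch: demo is a made-up function that reads one line from stdin and writes to both stdout and stderr, and the file names under the temporary directory are placeholders. Note that &> is a bash feature, so the script assumes it is run with bash rather than a plain POSIX sh.

```shell
#!/bin/bash
# demo is a hypothetical stand-in for "command": it reads one line from
# stdin, then writes one line to stdout and one line to stderr.
demo() {
  read -r line
  echo "out: ${line}"
  echo "err: ${line}" >&2
}

tmpdir=$(mktemp -d)
echo "hello" > "${tmpdir}/input.txt"

demo 0<"${tmpdir}/input.txt" 1>"${tmpdir}/a.txt" 2>&1   # explicit descriptors 0 and 1
demo <"${tmpdir}/input.txt" >"${tmpdir}/b.txt" 2>&1     # defaults 0 and 1 omitted
demo <"${tmpdir}/input.txt" &>"${tmpdir}/c.txt"         # bash shorthand for stdout+stderr

# All three output files should have the same content.
if cmp -s "${tmpdir}/a.txt" "${tmpdir}/b.txt" && cmp -s "${tmpdir}/b.txt" "${tmpdir}/c.txt"
then
  result="identical"
else
  result="different"
fi
echo "The three output files are ${result}"
```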

The & isn't for redirect, it's to background the function call so that I can do multiple function calls without waiting for each other. So what I'm trying to accomplish is:

my_function ${arg1} ${arg2} &

So that the script keeps moving forward while my_function is being run. Is that not happening here? Sorry if I'm being slow...
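For reference, here is a small self-contained sketch of that pattern. The body of my_function, the output file, and the one-second sleep are placeholders chosen for the illustration. Each call returns control to the script immediately, $! holds the PID of the most recent background job, and wait blocks until the listed jobs have exited.

```shell
#!/bin/bash
out_file=$(mktemp)

# Placeholder function: pretends to do some work, then records that it ran.
my_function() {
  sleep 1
  echo "finished: $1 $2" >> "${out_file}"
}

# Both calls start right away; the script does not wait between them.
my_function machine1 1 &
pid1=$!
my_function machine2 2 &
pid2=$!

echo "Started background jobs ${pid1} and ${pid2}; the script keeps going"

# Block here until both background calls have exited.
wait "${pid1}" "${pid2}"

finished=$(wc -l < "${out_file}" | tr -d '[:space:]')
echo "${finished} background calls finished"
```

Because the functions run in background subshells, any variables they set are not visible to the main script afterwards, which is why this sketch passes results back through a file.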

What you are doing is fine. As I understand it, you have already made progress on your problem, so the redirection details are just extra background for a better basic understanding, not your main task.

These are different kinds of things:

  • & (at the end of a command) runs the command in the background
  • >&2 redirects STDOUT to the same location as STDERR
  • &> redirects both STDOUT and STDERR to the target that follows (bash shorthand for >target 2>&1)
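A short sketch that uses all three in one script may make the distinction easier to see. The function talk and the file names are invented for the example, and it assumes bash, since &> is not POSIX sh syntax.

```shell
#!/bin/bash
tmpdir=$(mktemp -d)

# talk is a hypothetical function that writes one line to each output stream.
talk() {
  echo "to stdout"
  echo "to stderr" >&2    # >&2: this echo's stdout goes where stderr points
}

# Plain redirections: the two streams land in separate files.
talk > "${tmpdir}/out.txt" 2> "${tmpdir}/err.txt"

# &>: stdout and stderr both go to the one target that follows.
talk &> "${tmpdir}/both.txt"

# Trailing &: nothing to do with redirection, it runs the call in the background.
talk > "${tmpdir}/bg.txt" 2>&1 &
wait    # block until the background call has exited

both_lines=$(wc -l < "${tmpdir}/both.txt" | tr -d '[:space:]')
bg_lines=$(wc -l < "${tmpdir}/bg.txt" | tr -d '[:space:]')
echo "both.txt has ${both_lines} lines, bg.txt has ${bg_lines} lines"
```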

OK, thanks!