Help with trap and signals

Basherrr · August 14, 2013, 9:46am

I am having issues with trap not working inside a script. I am currently trying this on a Knoppix system, V 5.1. What I would like to happen is that when I press CTRL c, a message gets echoed and the script ends. For example:

#! /bin/bash
 
 
trap "echo CTRL c was pressed ; break" SIGINT SIGTERM 
 
 
while :
do
      :
done

When I press CTRL c in the above example, a message gets echoed, the loop ends, and that's the end of the script. Pretty straightforward. However, when I put a "sleep 5" in the while loop, run the script, and then press CTRL c, no message is echoed to the screen and the loop does not exit. Ex:

#! /bin/bash
 
 
   trap "echo CTRL c was pressed ; break" SIGINT SIGTERM
 
   sleep 5
 
 
 
  while :
  do
        echo in loop now
        sleep 5
 
 done

***NOTE*** If you try the code above, you will create an infinite loop and CTRL c will no longer work.

Once I add the sleep command to the loop, CTRL c no longer prints a message and no longer breaks out of the loop. Does anyone know how to get around this, and why the trap stops working when a sleep command is used in the loop?

I was thinking it had something to do with the trap having to wait until the sleep 5 finishes running before the trap handler gets executed, but the trap never gets run at all. I am guessing it has to do with something else. Thanks in advance for your help.

jim_mcnamara · August 14, 2013, 10:13am

sleep invokes a wait like nanosleep() and sets a signal handler for SIGALRM.

But /usr/bin/sleep is run in a separate child process. So, ctrl/c is sent to the child process, not the parent.

It is analogous to running the ls command on a huge directory. ctrl/c stops the child /usr/bin/ls executable image before it dumps 10 zillion files to the screen, not the parent process.

Lose the sleep command. It looks to me like you are trying out bash coding. ctrl/c only "works" on the parent when the parent is running an actual bash command like some kind of loop. Otherwise repeated ctrl/c keystrokes are required, so that the signal gets delivered during the execution of a susceptible piece of code.

Basherrr · August 14, 2013, 11:09am

Okay, sleep is run in a child process. That makes sense, but when I press CTRL c for the child process, why does the child process not echo the message to the screen?

trap "echo CTRL c was pressed ; break" SIGINT SIGTERM

Isn't the trap modification getting sent to the child process as well or does this modification only get applied to the parent process?

Is there a way to run sleep in the parent shell and not a child shell, so my trap modification will work?

I am not familiar with nanosleep(). Is it pretty much the same as sleep, except in nanoseconds instead of seconds?

You said that it sets up a signal handler for SIGALRM. If I add SIGALRM to the trap command, will this resolve the issue? Ex:

trap "echo CTRL c was pressed ; break" SIGINT SIGTERM SIGALRM

Then again, will this just get applied to the parent shell and not the child shell?

Please be thorough in your response, so I can understand what's going on. Thanks

Basherrr · August 15, 2013, 5:24pm

Also, is there a way to tell which commands, like sleep, are being run as a child process? Is there a way to make these commands run in the parent process?

Chubler_XL · August 15, 2013, 6:17pm

All commands, unless they are bash internals (builtins), are run as a child process.

As an example, try replacing the external sleep command with the internal read, like this:

while :
  do
        echo in loop now
        read -s -N 1024 -t 5
 
 done
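
To see which category a given command falls into, bash's type builtin can help. The following is only a small sketch: command_kind is a hypothetical helper name, and the result for sleep assumes it is installed as an ordinary external program, which is the usual case.

```shell
#!/bin/bash
# Hypothetical helper: report how bash would resolve a command name.
# "builtin" means it runs inside the shell itself; "file" means an
# external program that is run as a child process.
command_kind() {
    type -t "$1"
}

for cmd in read cd sleep; do
    printf '%-6s %s\n' "$cmd" "$(command_kind "$cmd")"
done
```

At an interactive prompt the same check is simply type read or type sleep.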

alister · August 15, 2013, 8:59pm

I ran your script with bash and dash and both worked fine.

Control-C sends SIGINT to every process in the foreground process group. Non-interactive shells run everything in the same process group, so when that process group is the foreground process group, Control-C sends a SIGINT to that shell, to all of its children, to all of its children's children, etc.

This suggests to me that you are executing that code in an interactive environment, where each command runs in its own process group. This would put sleep in a different process group from its parent shell. Since sleep is then the only member of the foreground process group, it is the only process to be sent a signal.

Are you sourcing the script at a command prompt with . or source ? If yes, that explains the behavior. sleep is in a process group separate from the shell which invoked it, so the shell is not sent the signal.

If you are not sourcing the script, be specific and tell us exactly what you're doing. Also, while the script is running, collect the pid, ppid, pgid, and stat information for the relevant commands and share it with us. For this, the following command will probably work on your system.

ps -o pid,ppid,pgid,stat,args -t /dev/pts/4

Change /dev/pts/4 to the actual terminal you're using (this can be determined by typing tty at its prompt).

Regards,
Alister

kshji · August 16, 2013, 12:58am

Here is my test script. It shows 3 methods to handle INT in a while loop.

repeat=1

realexit()
{
   echo " - int EXIT has done"
}

ctrlc()
{
   echo " - int INT (ctrl-C)"
   #exit 1   # stop the script
   repeat=0
}

####MAIN######

trap 'realexit' EXIT

# different methods to interrupt while using CTRL-C
# all works in ksh93 and bash, last version not in dash
trap 'ctrlc' INT
#trap 'echo " CCC " ; repeat=0 ' INT
#trap 'echo " Ctrl-C Break " ; break ' INT   # not while in dash, break the script

trap ':' HUP QUIT # nop - do nothing

while :
do
  repeat=1
  while ((repeat==1))
  do
    date
    echo "proc:$$"
    sleep 30
  done
  echo "while end"
done

Basherrr · August 16, 2013, 11:52am

Hello Alister,

Thanks for your reply. You are correct I am sourcing the script with

I don't understand why that matters though. I thought when you source a script you are running the script in the same shell, so I figure this should work compared to running the script in a new shell.

I now know something is not correct because I tried running the script without sourcing and the trap works fine, but I am confused as to why it doesn't work with sourcing. Could you please explain what is going on here?

ps -o pid,ppid,pgid,stat,args -t /dev/pts/**

returns:

  PID  PPID  PGID STAT COMMAND
25592 25589 25592 Ss   -bash
25679 25592 25679 S+   sleep 155

Does the PPID just list processes with a parent (child processes)? Also, could you explain what a process group is?

btw: I am running these commands using ssh (PuTTY terminal), if that matters. I am not sure if I am running a non-interactive shell or an interactive shell. I will have to find out more about this because I am not really sure what this means.

Thanks again for your help!

I did some research and found the following link to be somewhat helpful.

Process group - Wikipedia, the free encyclopedia

Basherrr · August 19, 2013, 9:45am

Is there a way to source this script and have the trap that I set up work as intended? I am trying to export variables to a script that has a while loop with a sleep command in it. I know I can just pass the variables in a file, but I would prefer to export them through sourcing; however, I would also like to have a modified trap in this sourced script. Is there any way to do this? I am also still trying to figure out why the trap doesn't work when sourcing, but works if the script is run without sourcing.

kshji · August 19, 2013, 12:30pm

ksh93 can use $LINENO. This short test script shows it.

PRG=$0
PID=$$

# Ctrl-C test
trap 'echo $PID - $PRG - line:$LINENO;exit 2' INT

while true
do
        echo "my pid is $PID"
        sleep 5
        echo "."
        sleep 5
        echo ".."
        sleep 5
done

alister · August 19, 2013, 1:40pm

Yes. Use a non-interactive shell to source the script. Or, if you must use an interactive shell, invoke set +m (but this is unusual and may be a sign that you're going about things in the wrong way).
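
A rough sketch of the set +m approach, for illustration only. The helper name source_without_job_control is hypothetical; it assumes bash, turns job control off around the source, and turns it back on only if it was on before.

```shell
#!/bin/bash
# Hypothetical helper: source a file with job control (monitor mode)
# switched off, then restore the previous setting.
source_without_job_control() {
    local script=$1 had_monitor=0 status
    case $- in *m*) had_monitor=1 ;; esac
    set +m                 # children now stay in the shell's process group
    . "$script"
    status=$?
    if [ "$had_monitor" -eq 1 ]; then
        set -m             # re-enable job control only if it was on before
    fi
    return "$status"
}
```

It would be called as source_without_job_control ./my-script.sh. Because the file is sourced inside a function, it sees the function's positional parameters and local variables, so typing set +m, sourcing, and set -m by hand at the prompt may be simpler.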

The issue has absolutely nothing whatsoever to do with sourcing versus not-sourcing. The issue is one of running an interactive shell with job control enabled versus a non-interactive shell without job control. You can source from either, you just happen to be sourcing from an interactive shell.

How does the system decide if a shell is interactive? An interactive shell is any shell that's invoked with the -i option, or that is not given any script to execute and has stdin and stderr pointing to a terminal (presumably, there's a human at the helm). Otherwise, the shell is considered non-interactive.
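
One way to check this from inside a shell is to look at the option flags in $-, which contain i for an interactive shell and m when job control is enabled. A minimal sketch follows; shell_mode is a hypothetical helper name and bash is assumed.

```shell
#!/bin/bash
# Hypothetical helper: classify a shell from its option flags.
# Pass "$-" to describe the current shell.
shell_mode() {
    local flags=$1 mode="non-interactive" jc="job control off"
    case $flags in *i*) mode="interactive" ;; esac
    case $flags in *m*) jc="job control on" ;; esac
    printf '%s, %s\n' "$mode" "$jc"
}

shell_mode "$-"
```

Running echo $- at a prompt and again from inside a script shows the difference directly.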

Interactive: bash
Non-interactive: bash my-script.sh

When a shell is interactive, each pipeline is considered an individual job. A human (presumably) interacting with the shell can suspend and resume these jobs, and put foreground jobs in the background and bring background jobs to the foreground. But how does the interactive shell implement such job control?

Imagine, for example, that you wanted to sort a file and then write the first 10 lines of the result to a file. You might run the following pipeline:

sort datafile | head > output

What if you then decide to terminate that job? You type Control-C (^C), which works by sending the foreground job the SIGINT signal. To properly terminate a job requires terminating each and every process that is part of that job. This is facilitated by putting all of those processes in the same process group and sending the signal to every member of that process group.

A process group id (pgid) is nothing but an integer in a process-related data structure, just like the process id (pid) and the parent's process id (ppid). A process group is a set of processes that share a common pgid value.

If there are many process groups on a system, how does the terminal know which process group to signal? The terminal itself is associated with a process group id, and this pgid is the terminal's foreground process group. In an interactive shell, when you run a command (or pipeline), it is run in its own process group and that process group becomes the terminal's foreground process group. You can change the foreground process group using job control (^Z, bg, fg).

In the ps output you posted, the + in the STAT column indicates that a process is a member of the foreground process group. Only these processes will receive keyboard-generated signals.

When using a non-interactive shell, each individual pipeline in a script is no longer considered a distinct job (unlike when they are entered at a command prompt). This makes sense since the commands would not have been grouped into a script if they were not part of a single task. Since the script itself is considered the only job, job control in non-interactive shells is seldom needed and is therefore disabled by default.

With job control disabled, the non-interactive shell does not create any process groups. Every command run by that shell inherits the shell's process group. This allows manipulating every process of every pipeline that the script runs by signaling a single process group. If this were not the case, if the non-interactive shell created a process group for each command/pipeline, using ^C to kill a script would be impossible because only the current command would receive the signal. As soon as that command exits, the next runs. The parent shell would be unaffected (sound familiar?).
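
The inheritance can be observed with a short experiment. This is a sketch that assumes bash and a ps that accepts -o and -p (as the procps version on Linux does); pgid_of is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the process group id of the given pid.
pgid_of() {
    ps -o pgid= -p "$1" | tr -d ' '
}

set +m                          # job control off, as in a non-interactive shell
sleep 5 &                       # external command, run as a child process
child=$!

shell_pgid=$(pgid_of "$$")      # process group of the shell itself
child_pgid=$(pgid_of "$child")  # process group of the sleep child

echo "shell pgid: $shell_pgid"
echo "child pgid: $child_pgid"

kill "$child" 2>/dev/null       # clean up the background sleep
wait "$child" 2>/dev/null || true
```

With job control off, the two values should match. Repeating the experiment after set -m in an interactive shell should show the child in a process group of its own.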

Process groups also play a role in whether an individual process is allowed to interact with the terminal. Chaos would ensue if any process could unexpectedly take over and consume characters from a terminal. To prevent this, only processes in the foreground process group are allowed to read from the terminal (sometimes, the same is true of writing).

If you're still a bit lost, you're not alone. In my experience, most people do not have a solid grasp on how process groups, jobs, signals, and terminals are used to implement the interactive environment. The only way to get a handle on it is to research and experiment.

Your shell's man page covers its job control features. For a deeper dive, refer to POSIX - General Terminal Interface and glibc job control.

Regards,
Alister

Basherrr · August 20, 2013, 9:42am

Thanks, Alister!

That really clears some things up! Thanks for taking the time to write that response. I found it very helpful and I am pretty sure other people will find it helpful as well.