Strange SIGINT propagation between Parent/Child sh scripts

Good day,

I am trying to add signal handling capabilities to some of my scripts. Unfortunately, I am having some difficulty with the manner in which signals are propagated between parent/child processes. Consider the following example:

I have the following "parent" script:

#!/usr/bin/sh

child_id=0

SIGINT_handler()
{
     echo "SIGINT caught in Parent"
     if [  $child_id  -ne 0 ]
         then
            echo Parent sending: kill -n 2 $child_id
            kill -n 2 $child_id
     fi
}


echo parent running
trap 'echo parent exiting; exit' 0
trap 'SIGINT_handler' 2                            # Pass Signal 2 to child but don't die
./child &
child_id=$!
echo 'child_id = ' $child_id

# Wait until child-process exits
wait $child_id
WAIT_STATUS=$?
echo Wait Status recorded when parent continues: $WAIT_STATUS

echo Parent Still Running after the child exits!
sleep 1000

which calls the following "child" script (as a background process):

#!/usr/bin/sh

echo child started. pid is $$
#trap 'echo child exiting; exit 0' 0
trap 'echo child got signal 2; exit 0' 2
sleep 1000

When I execute the "parent" process with a ">./parent",
I observe the expected behavior. I then send a
SIGINT to the "parent" via a ">kill -s SIGINT <ppid>" and
get the following screen output:

When I look at the processes that are still active, I see
that the child (and its sleep) has not been killed at all -
all that happened was that the sleep-process of the parent
was activated.

I would appreciate it if anybody has some idea what is happening
here.

'When I execute the "parent" process with a ">./parent",' This does not work for me, but '. parent' does run it. Similarly, I changed './child &' to '. child &' and got the following output: -

TX5XN:/home/brad/wip/signals>. parent
parent running
child started. pid is 3698
[1] 7191
child_id = 7191

a ps -ef gave: -

brad 7191 3698 0 20:04 pts/0 00:00:00 ksh

CTRL C of parent produced this output: -

^Cchild got signal 2
Wait Status recorded when parent continues: 0
Parent Still Running after the child exits!

"child got signal 2" is the message from the child's signal handler, not the parent; at this point a ps shows that the child is gone.

A further CTRL C gives: -

^CSIGINT caught in Parent
Parent sending: kill -n 2 7191
kill: 7191: no such process
SIGINT caught in Parent
Parent sending: kill -n 2 7191
kill: 7191: no such process

Note that the handler is called twice.

When I run your code with the line './child &' in it I get the output: -

TX5XN:/home/brad/wip/signals>. parent
parent running
ksh: .: line 19: ./child: not found
[1] 7398
child_id = 7398
Wait Status recorded when parent continues: 127
Parent Still Running after the child exits!
^CSIGINT caught in Parent
Parent sending: kill -n 2 7398
kill: 7398: no such process
SIGINT caught in Parent
Parent sending: kill -n 2 7398
kill: 7398: no such process

Are you sure you actually ran the child process, or am I missing something? I don't have much experience with signal handlers.
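
For what it's worth, in the first transcript above the child reports pid 3698, and ps lists 3698 as the parent of process 7191. That is consistent with "." reading a script into the current shell rather than starting a new process for it. A minimal sketch of the difference, assuming a throwaway file under ${TMPDIR:-/tmp} (the file name and the variable names are made up for illustration):

```shell
#!/bin/sh
# Hypothetical scratch script that only prints the pid of the shell running it.
script=${TMPDIR:-/tmp}/show_pid.$$
echo 'echo $$' > "$script"
this_pid=$$

# "." reads the file into the current shell, so $$ is this shell's pid.
# ($$ keeps the same value inside the command substitution.)
sourced_pid=$(. "$script")

# Running the file with a new sh starts a separate process with its own pid.
executed_pid=$(sh "$script")

echo "this shell: $this_pid"
echo "sourced:    $sourced_pid"
echo "executed:   $executed_pid"

rm -f "$script"
```

The sourced run should report the same pid as the invoking shell, while the executed run reports a different one.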

Hi steadyonabix,

Thanks for your quick reply! I think the "./" works when you make the scripts executable. What I did was make each file, "parent" and "child", executable with chmod, i.e.

> chmod "+x" parent

and then the same for the child. Then the "./parent" and "./child &" commands should work.

Also, you are using the korn shell. I am using the bash, or sh. (I am not sure which one of the 2 - I am fairly new to Unix).

I have reformulated my original question in the form of 3 new small sh scripts to illustrate exactly what I am struggling with, w.r.t parent/child process behavior:

Consider a "parent" process:

#!/bin/sh

sleep 600 &
#./child1 &
PID1="$!"

sleep 600 &
#./child2 &
PID2="$!"

trap "kill $PID1 $PID2" exit  INT
wait

I made the script executable with ">chmod "+x" parent" and executed it in a bash environment, via ">./parent".

I then do a "ps -alf | grep dludick" to see my list of running processes, and observe the following output:

The parent process and the 2 sleep (children) are clearly running.

When I then execute a ">kill -s SIGINT 22421" all the processes are killed correctly. I verify this by doing a "ps -alf | grep dludick" and observe:

Now, when I change the parent script to:

#!/bin/sh

#sleep 600 &
./child1 &
PID1="$!"

#sleep 600 &
./child2 &
PID2="$!"

trap "kill $PID1 $PID2" exit  INT
wait

where the child1 and child2 scripts are as follows:

child1:

#!/bin/sh
echo starting proc 1, pid=$$
sleep 600

and child2:

#!/bin/sh
echo starting proc 2, pid=$$
sleep 600

(Note that for each of the above scripts you also need to do ">chmod "+x" child1/2".)

When again executing the parent via ">./parent" followed by ">ps -alf | grep dludick" I get the following active process summary:

Illustrating the correct "parent/child" relationships.

Now, when I kill the parent with a SIGINT ("> kill -s SIGINT 22469") I see that the following processes remain active:

Does anybody know why the sleep processes are not being killed when each of the children receives a "SIGINT" signal? How can I modify the above scripts to ensure this?

That's because the sleep procs didn't get a kill signal. The parent got it. It then killed the kids. But the sleeps (which are kids of the children of the parent) did not get a kill, so they continue to sleep. Note that the parent of the sleeps has changed from the original parents to proc id 1 (aka init). The parents died, but the sleeps are still sleeping...
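
The effect described above can be reproduced in a few lines. This is only a sketch: the scratch file name and the variable names are made up for illustration, and an inline "sh -c" stands in for the child1 script. Note that a plain "kill" sends SIGTERM, not SIGINT.

```shell
#!/bin/sh
# Hypothetical scratch file used to pass the sleep's pid back to this script.
pidfile=${TMPDIR:-/tmp}/orphan_demo.$$

# Stand-in for child1: start a sleep, record its pid, then wait for it.
sh -c 'sleep 30 & echo $! > "$1"; wait' child "$pidfile" &
child=$!

# Wait until the child has written the pid of its sleep.
while [ ! -s "$pidfile" ]; do sleep 1; done
sleep_pid=$(cat "$pidfile")

kill "$child"               # SIGTERM goes to the child shell only
wait "$child" 2>/dev/null   # reap the child

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$sleep_pid" 2>/dev/null; then
    orphan_alive=yes
else
    orphan_alive=no
fi
echo "sleep still running after its parent was killed: $orphan_alive"

# Clean up the orphaned sleep and the scratch file.
kill "$sleep_pid" 2>/dev/null
rm -f "$pidfile"
```

Running "ps -o pid,ppid -p <pid of the sleep>" just before the cleanup step should show the sleep with a new parent (typically 1).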

Thank you for your quick reply, your advice worked.

I understand what you mean about the children of the children (i.e. the sleep processes) not getting the signal. I tried the following:

Changed "child1" to:

#!/bin/sh

echo starting proc 1, pid=$$
sleep 600 &
pid1=$!
trap "kill $pid1" exit INT
wait

and then "child2" to:

#!/bin/sh

echo starting proc 2, pid=$$
sleep 600 &
pid1=$!
trap "kill $pid1" exit INT
wait

so that the INT signal gets propagated to the spawned "sleep" children of each. When I now send a SIGINT to the original parent script, I see that the sleep processes are killed too. I think this solves the problem.
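
A small, self-checking variant of the same idea is sketched below. It is not the exact scripts above: an inline "sh -c" stands in for child1, the scratch file and variable names are made up, and TERM is trapped explicitly because a plain "kill" sends SIGTERM and some shells may not run the EXIT trap when they are killed by a signal that has no trap set. The child also waits for its sleep inside the EXIT trap so that the sleep is reaped before the child exits.

```shell
#!/bin/sh
# Hypothetical scratch file used to pass the sleep's pid back to this script.
pidfile=${TMPDIR:-/tmp}/forward_demo.$$

# Stand-in for child1, with the signal forwarding added.
sh -c '
    sleep 30 &
    pid=$!
    # On any exit, stop the sleep and reap it.
    trap "kill $pid 2>/dev/null; wait $pid 2>/dev/null" EXIT
    # Turn a trapped signal into a normal exit so the EXIT trap runs.
    trap "exit 1" INT TERM
    # Publish the pid only after the traps are in place.
    echo "$pid" > "$1"
    wait
' child "$pidfile" &
child=$!

# Wait until the child has installed its traps and written the pid.
while [ ! -s "$pidfile" ]; do sleep 1; done
sleep_pid=$(cat "$pidfile")

kill "$child"               # SIGTERM to the child shell
wait "$child" 2>/dev/null   # returns once the child has cleaned up and exited

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$sleep_pid" 2>/dev/null; then
    sleep_alive=yes
else
    sleep_alive=no
fi
echo "sleep still running after the child was killed: $sleep_alive"

rm -f "$pidfile"
```

With the traps in place the sleep should be gone by the time the child has exited.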