Controlling child processes

forumGuy · March 25, 2004, 12:15pm

Hello all, I am trying to create n child processes and control them from a parent process; say, make child 3 print its pid and then have child 5 do the same and some other stuff. Is there a way to accomplish this after all the child processes have been created via calls to fork()?
Thank you,
FG

Driver · March 25, 2004, 12:45pm

> Is there a way to accomplish this after all the child processes have been created via calls to fork()?

No. Unless child and parent explicitly define a protocol to control such things on a voluntary basis, e.g. the parent could send a certain signal to a child in order to have it react in a pre-defined manner.

Given that all your target processes are children, the parent could create a pipe with each of them and send a signal whenever you've just sent a command down the pipe. The children would check the state of the pipe whenever the signal arrives and take appropriate actions. Of course, this might not be the best way depending on your requirements (apparently you don't seem to know what you want either), but it's a pretty asynchronous one.

Other than that, you could in theory kludge your own machine code into the child's text segment e.g. by using ptrace() or /proc/<pid>/mem or somesuch, but this is hardly applicable, so I'd guess you want to do the former.

forumGuy · March 25, 2004, 3:05pm

I gave it some thought, and based on your reply I would like to do the following: "the parent could create a pipe with each of them and send a signal whenever you've just sent a command down the pipe"
-- I have to guarantee that when I send the command down the pipe and then the signal, this is all done atomically. I would also like to use semaphores to guard this section of the code; I do not want to travel the disabling interrupts route or busy waiting.
If this is possible, can someone point me to some sample code or resources for all tasks: creation of pipes and writing commands to them, signaling the process and using semaphores?
Thank you.
FI

Driver · March 25, 2004, 3:54pm

> I have to guarantee that when I send the
> command down the pipe and then the signal,
> this is all done atomically

I assume that by ``atomically'', you mean that every receipt of a signal indicates the availability of exactly one new command and that no signals can be lost.

First of all, you should be familiar with signal handling in general. If you are not, you should read the sigaction(2) manual page to begin with. Another question is what your children are doing while not serving commands of the parent. Because of the very limited things you can do within a signal handler (usually, you should not do more than set a flag), the way you handle requests will depend greatly upon this.

If they are not doing anything, you could simply call pause(), install a signal handler which does not restart interrupted system calls (sa_flags = 0 with sigaction()) and handle everything after pause() returns with errno = EINTR. Otherwise, you would have to integrate some kind of flag into your child's main loop which is checked on a regular basis and which is set by the signal handler.

> I would also like to use semaphores to guard
> this section of the code, I do not want to
> travel the disabling interrupts route or busy waiting.

The signal being handled can be blocked while your signal handler is executing, but this could lead to signal loss if the parent sends more than one signal before the child gets to handle it. If your target operating system(s) support(s) the POSIX realtime extension function sigqueue(), you could use that instead of bothering with semaphores to avoid loss (alas, the various BSD's do not have this function, Linux and UNIX(R) branded systems do, however).

Of course, write()s of at most PIPE_BUF bytes (POSIX guarantees that PIPE_BUF is at least 512) by the parent are guaranteed to take place atomically already, so you could equally well ignore lost signals and read all commands there are whenever you receive a signal.
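
A sketch of "read all commands there are", assuming the child is allowed to switch its end of the pipe to non-blocking mode. The function name drain_pipe is hypothetical; splitting the returned bytes into individual commands is left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Read everything currently queued in the pipe without blocking.
 * Returns the number of bytes stored in buf, or -1 on a real error. */
ssize_t drain_pipe(int fd, char *buf, size_t cap)
{
    size_t total = 0;
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);

        if (n > 0)
            total += (size_t)n;
        else if (n == 0)
            break;                  /* writer closed the pipe */
        else if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                  /* pipe is empty for now */
        else
            return -1;
    }
    return (ssize_t)total;
}
```

Called after each signal, this picks up every command written so far, so two signals that were merged into one do no harm.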

> If this possible can someone point me to
> some sample code or resources for all
> tasks, creation of pipes and writing
> commands to it, signaling the process and
> using semaphores.

This page seems like a good overview:

http://www.cs.cf.ac.uk/Dave/C/CE.html

I also recommend the excellent book ``Advanced Programming in the UNIX Environment'' by Richard Stevens.

forumGuy · March 25, 2004, 4:20pm

Thank you for all your suggestions and resources. Just some insight into the problem I am trying to solve: I have to create n processes and have each print its process number to a file m times. This is a sample:
% a.out 5 3 temp

should result in temp having the following contents:
1
2
3
4
5
1
2
3
4
5
1
2
3
4
5

Can you think of another solution other than the possibilities discussed?
Thank you,
FG

Driver · March 25, 2004, 4:36pm

In other words, the children are doing nothing while waiting for the parent's notification. I agree that you will need a semaphore then, because the order in which signals are handled cannot be enforced in a reliable manner without resorting to changing scheduling policies, which is definitely not justified in this case.

I would probably do this:
Have each child install a signal handler for SIGUSR1, without restarting interrupted system calls. Have a loop calling pause() for the desired number of times the PID shall be written. After the parent has signaled the current child, it goes to sleep on a semaphore. After the child has written its PID to the file, it awakes the parent by incrementing the semaphore and calls pause() again. The parent goes on to signal the next child, and so on. Good luck

forumGuy · March 25, 2004, 4:57pm

Thank you for all your help, I will be coding this weekend.

DreamWarrior · March 29, 2004, 12:28pm

You could also try a message queue or domain socket. While these are a bit more complex than a pipe, they are bidirectional and can therefore make this "do this; ok done" protocol a bit simpler than semaphores and signals.

In my opinion, an easier, more natural way of doing this is to use select() to wait for the commands to come in from the pipe and then communicate back to the parent (who also uses select()) when the command has been completed. This is, in my opinion, a much better and more predictable way of coding this. Plus, it has the advantage of not being capable of "losing" signals, because the pipe will continue to be readable until all commands are read off it.

For simplicity, you could use a single pipe to each child, have the parent write the command down the pipe, then the child (blocking in a select) would read it and signal back to the parent (either via another pipe, or with a bidirectional IPC mechanism such as a domain socket).

The simplest way I'd attack this is as follows:

Create a pipe which will be used to talk back to the parent. We'll call this pipe "U" (for upstream to the parent)

Create a pipe to each of the n children and fork them. (Remember to close the write end in the child code after the fork). We'll call this pipe "D" (for downstream from the parent)

Close the read end of U in each child. Then go into an infinite select() loop waiting for that child's "D" to become readable.

Close the read end of EACH D you created in the parent.

While all that closing isn't really needed, it is recommended as a real application wouldn't want to leave dangling references to file descriptors it doesn't need.

At this point you can do the following:

In the parent:
Send a command down the appropriate "D" pipe from the parent.
Enter a select loop waiting for the child to send the "done" response back on "U"
Read the response.
Restart this process for the next child.

In the child:
Read the command, handle it, then write "done" to the parent's "U".

And that's pretty much it.
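
The steps above can be sketched roughly as follows. The one-byte commands ('w' for "do your work", 'q' for "quit"), the one-byte acknowledgement carrying the child's number, and the names run_commands, serve and wait_readable are all assumptions made for the example.

```c
#include <stdlib.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block in select() until fd is readable. Returns 0 on success. */
static int wait_readable(int fd)
{
    fd_set rd;

    FD_ZERO(&rd);
    FD_SET(fd, &rd);
    return select(fd + 1, &rd, NULL, NULL, NULL) == 1 ? 0 : -1;
}

/* Child: serve one-byte commands from D, acknowledge each one on U. */
static void serve(int d_read, int u_write, unsigned char number)
{
    char cmd;

    while (wait_readable(d_read) == 0 &&
           read(d_read, &cmd, 1) == 1 && cmd != 'q') {
        /* ... the real work for the command would go here ... */
        if (write(u_write, &number, 1) != 1)
            break;
    }
    _exit(0);
}

/* Parent: command each of n children in turn, `rounds` times over.
 * acks receives the number reported by each child, in order.
 * Returns the number of acknowledgements, or -1 on error. */
int run_commands(int n, int rounds, unsigned char *acks)
{
    int up[2], count = 0;
    int *down = malloc((size_t)n * sizeof *down);     /* write end of each D */
    pid_t *pids = malloc((size_t)n * sizeof *pids);

    if (down == NULL || pids == NULL || pipe(up) == -1)
        return -1;

    for (int i = 0; i < n; i++) {
        int d[2];

        if (pipe(d) == -1 || (pids[i] = fork()) == -1)
            return -1;
        if (pids[i] == 0) {
            close(d[1]);                /* child never writes to D */
            close(up[0]);               /* child never reads from U */
            for (int k = 0; k < i; k++)
                close(down[k]);         /* other children's D pipes */
            serve(d[0], up[1], (unsigned char)(i + 1));
        }
        close(d[0]);                    /* parent never reads from D */
        down[i] = d[1];
    }
    close(up[1]);                       /* parent never writes to U */

    for (int r = 0; r < rounds; r++)
        for (int i = 0; i < n; i++) {
            if (write(down[i], "w", 1) != 1 ||
                wait_readable(up[0]) != 0 ||
                read(up[0], &acks[count], 1) != 1)
                return -1;
            count++;
        }

    for (int i = 0; i < n; i++) {
        if (write(down[i], "q", 1) != 1)
            count = -1;
        close(down[i]);
        waitpid(pids[i], NULL, 0);
    }
    close(up[0]);
    free(down);
    free(pids);
    return count;
}
```

With a single descriptor per process, select() is not strictly needed (a plain blocking read() would do); it is kept here because it extends naturally to a child that watches several descriptors.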

At the end you may want the parent to send "termination" commands down each "D" to kill off the children, or just use the kill command to do it explicitly. You'll also probably want to investigate the SIGCHLD signal, as you may want to set it to be ignored to keep from getting zombie (or defunct) processes.

Anyway, just another idea. Personally, I don't like using signals to alert a process to do something given their potential for loss and the fact that their reception will intercept your code's stack and therefore the work they can do is very limited.

forumGuy · March 29, 2004, 1:52pm

When you write a value to a pipe, when is the value removed? Also, does anyone have sample code for creating pipes and assigning them to child processes?
Thanks,
FG

Driver · March 29, 2004, 3:04pm

This is getting slightly offtopic, but what the heck, there are not many posts in this forum anyway...

> In my opinion, an easier more natural way [...]

One could also argue that signals would be more natural, because the problem forumGuy is solving does not require a means to carry data; it's only the notification that is desired, since there are no commands other than ``write PID''. Signal loss need not be a problem either, for the reasons I mentioned earlier.
But as you said, this is a matter of personal taste and preference.

> their reception will intercept your code's stack and therefore the work they can do is very limited.

Actually, it's not the stack that limits the usability of signal handlers. Unless you specify your own stack for signal handling (using the sigaltstack() function), its use of memory will be much the same as that of a simple nested function call, including the usual automatic expansion of stack space if its current end is reached.

The real problem is that static data might be corrupted if a signal is posted while the normal path of execution uses and relies on static data that will also be accessed by the signal handler. If write access to variables that are also read or written by signal handlers is not protected by blocking signals (analogous to the way you have to block interrupts in the top half of a device driver so it does not interfere with updates from an interrupt handler), intolerable race conditions will result (there are also tolerable race conditions, but that's another story).

All variables shared with a signal handler have to be volatile-qualified to begin with. If the normal path of execution is in the process of updating a shared resource, the signal handler might find it in an inconsistent state. This is, however, a solvable problem, as I said: Block signals as needed and you can write away.

Unsolvable problems arise when the standard C library comes into play (this is the real reason why signal handlers are not very powerful). A function you intend to call from within a signal handler must be guaranteed to be reentrant, because otherwise a code path running this function might be using static data. If a signal calls the same function, it will invariably corrupt this static data and your program will break.

Even if the function happens to be unused at a certain point, the behavior of calling it from within a signal handler is still undefined and should not be relied on. The POSIX list of functions that are guaranteed to be called safely from a signal handler is very short. To begin with, it excludes ALL Pthreads-related functions, so you cannot e.g. signal a condition variable and unlock a mutex from within a signal handler.
-------------snip--------------

> When you write a value to a pipe, when is the value removed?

As soon as a reader of the pipe read()'s the data.

> Also does anyone have sample code of creating pipes and assigning them to child processes?

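#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
        int     fds[2];
        pid_t   pid;

        if (pipe(fds) == -1) {
                perror("pipe");
                return 1;
        }

        if ((pid = fork()) == -1) {
                perror("fork");
                return 1;
        } else if (pid == 0) {
                /* Child: reads from the pipe */
                char    buf[128];
                ssize_t rc;

                close(fds[1]);          /* the child does not write */
                if ((rc = read(fds[0], buf, sizeof buf - 1)) == -1) {
                        perror("read");
                } else if (rc == 0) {
                        /* Other end closed by parent */
                } else {
                        buf[rc] = 0;
                        puts(buf);
                }
                close(fds[0]);
        } else {
                /* Parent: writes to the pipe */
                close(fds[0]);          /* the parent does not read */
                if (write(fds[1], "hello world", sizeof "hello world") == -1) {
                        perror("write");
                }
                close(fds[1]);
                waitpid(pid, NULL, 0);  /* collect the child */
        }
        return 0;
}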

I didn't test this, but something along these lines should work... Again, I refer you to the pipe(), read() and write() manual pages in case you need more information.

forumGuy · March 30, 2004, 4:13pm

Thank you for the sample code. I was discussing pipes with one of my colleagues, and it seems that a read on a pipe blocks until data is available; because of this, all the processes can be coordinated amongst themselves to perform the tasks that were previously outlined. Can someone shed some light on this if possible?
Thanks,
FG

Driver · March 30, 2004, 4:29pm

You cannot share a single pipe among all children and the parent in this case, because it's impossible to direct the data to a particular child, so the reader of the next message will be picked more or less at random out of the bunch of children blocked on the pipe.
You would have to have one pipe per child.

Each child can block on its pipe and wait for a message by the parent. The problem here is that a pipe is generally uni-directional, i.e. both descriptors for I/O permit only input or output, but not both. Some pipe implementations do support bi-directional transfers, but this feature is not specified by the various POSIX and UNIX standards and should not be relied on.

It follows that you will need an additional means to notify the parent of the completion of command interpretation and execution, such as an additional pipe, or a semaphore, or a signal handler, or a message queue, or ...

forumGuy · March 30, 2004, 4:47pm

My mistake, I forgot to mention the most important thing, the scheme has to change, it cannot be a master slave relationship. So once the children are created the parent either waits or exits without doing anything else. The children have to coordinate and achieve the following output:
% a.out 5 3 temp

should result in temp having the following contents:
1
2
3
4
5
1
2
3
4
5
1
2
3
4
5
So if I can set up a pipe from P1 (Process1) to P2, from P2 to P3, ..., Pn-1 to Pn, and then another pipe for each process (each process has two pipes) to the output file, is there a way to coordinate this? Also, when you say "Each child can block on its pipe", can you expand a bit, maybe with an example?
Thanks,
FG

Driver · March 30, 2004, 5:00pm

> My mistake, I forgot to mention the most
> important thing, the scheme has to change,
> it cannot be a master slave relationship.

Why not?

> So if I can set up a pipe from P1 (Process1)
> to P2 from P2 to P3 ...Pn-1 to Pn and then
> another pipe for each process (each process
> has two pipes) to the output file, is there a
> way to coordinate this?

Two processes can only communicate through a pipe if it was created by a common ancestor before the fork() (or one of them created it itself). The pipes have to be set up by the parent, and you must also make the children aware of their position in this line of pipes, for example by writing it to a counter before performing a fork().

Also, if you choose to use pipes after all, you don't have to communicate the results back again. The final child in the line could write to the first pipe so you get a circle.
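
A sketch of such a circle, assuming a one-byte "token" that grants the right to write, and an output file opened with O_APPEND and shared by all children. Pipe i carries the token from child i to the next child; the parent only creates the pipes, drops the token in and waits. The names run_ring and ring_member are invented for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child i: wait for the token, append our number, pass the token on. */
static void ring_member(int i, int n, int m, int (*pipes)[2], int out)
{
    int from = pipes[(i + n - 1) % n][0];   /* read end of predecessor's pipe */
    int to = pipes[i][1];                   /* write end of our own pipe */
    char token, line[32];

    for (int k = 0; k < n; k++) {           /* close every end we do not use */
        if (pipes[k][0] != from) close(pipes[k][0]);
        if (pipes[k][1] != to) close(pipes[k][1]);
    }
    for (int j = 0; j < m; j++) {
        if (read(from, &token, 1) != 1)     /* blocks until it is our turn */
            _exit(1);
        int len = snprintf(line, sizeof line, "%d\n", i + 1);
        if (write(out, line, (size_t)len) != len)
            _exit(1);
        /* The very last writer keeps the token: nobody is left to read it. */
        if (!(i == n - 1 && j == m - 1) && write(to, &token, 1) != 1)
            _exit(1);
    }
    _exit(0);
}

/* Returns 0 on success, -1 on failure. */
int run_ring(int n, int m, const char *path)
{
    int (*pipes)[2];
    int out, status, rc = 0;

    if (n <= 0 || m <= 0)
        return -1;
    pipes = malloc((size_t)n * sizeof *pipes);
    out = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644);
    if (pipes == NULL || out == -1)
        return -1;

    for (int i = 0; i < n; i++)             /* all pipes exist before any fork */
        if (pipe(pipes[i]) == -1)
            return -1;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == -1)
            return -1;
        if (pid == 0)
            ring_member(i, n, m, pipes, out);
    }

    /* Drop the token into the last pipe so that the first child starts. */
    if (write(pipes[n - 1][1], "t", 1) != 1)
        rc = -1;
    for (int i = 0; i < n; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    for (int i = 0; i < n; i++)
        if (wait(&status) == -1 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            rc = -1;
    close(out);
    free(pipes);
    return rc;
}
```

The loop index i is the "counter" mentioned above: each child learns its position simply because i had that value at the moment of the fork().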

> Also when you say "Each child can block on
> its pipe" can you expand a bit

What's to expand? I repeated what you said in your post: That the read will block if there is no data.

I think it's time for you to research the rest of your solution on your own now.

forumGuy · March 30, 2004, 5:07pm

Thank you for the info and all your help.
FG

DreamWarrior · April 19, 2004, 12:22pm

That's what I meant, you put it better. I meant that because it can intercept the stack at any time the code running just prior is unknown and therefore the state of the application's data is generally difficult to deduce.

Anyway, as I stated, and you agreed, it is really just personal preference which method is to be used. I dislike signals for such things as this. Besides, which signal do you use? There are user1/user2, but there are no "sync" or "comm" signals. To me it just seems like a bad approach to use signals. But you're right, technically there is no "data" to necessitate a pipe. However, one could argue that a signal is just another form of data, just transmitted through a different, and in my opinion less reliable, manner.

Well, for what it's worth, don't take this as argumentative because it's not meant to be. Debating personal preference is, however, sometimes worthless...so I guess I'll shut up now :D.

forumGuy · April 19, 2004, 1:05pm

Thank you all for the help, decided to go with a shared memory solution:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <fcntl.h>

int main(int argc, char* argv[]) {

	pid_t res; /* fork() result */
	int i, j, m, n;
	int shm_id, child_num;
	int fd;
	int num_char;
	volatile int *shm_ptr; /* volatile: the value is changed by other processes */
	char buf[32]; /* room for a decimal int plus newline */

	if( argc != 4 || ( (n=atoi(argv[1])) <= 0 ) || ( (m=atoi(argv[2])) <= 0) ) {
		fprintf(stderr,"Usage: %s <n - number of processes> <m - times process prints> <file - name of file>\n", argv[0]);
		exit(1);
	}

	/* Open file for writing; O_APPEND makes every write() land at the end */
	if( (fd = open(argv[3], O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0644 )) == -1 ){
		fprintf(stderr, "Cannot open file name = %s\n", argv[3]);
		exit(1);
	}
	/* Init shared memory */
	if( (shm_id = shmget(IPC_PRIVATE, sizeof(int), 0600 | IPC_CREAT)) == -1 ){
		fprintf(stderr, "Cannot create shared memory \n");
		exit(1);
	}

	/* Attach this process to shared memory buffer; the children
	   inherit the attachment across fork() */
	if( (shm_ptr = shmat(shm_id, 0, 0)) == (void *) -1 ){
		fprintf(stderr, "Cannot attach process to memory \n");
		shmctl(shm_id, IPC_RMID, 0);
		exit(1);
	}
	*shm_ptr = 0; /* child 0 goes first */
	for(i = 0; i < n; i++) {
		if( (res = fork()) < 0 ){
			fprintf(stderr, "Cannot create child %d \n",(i+1));
			exit(1);
		}

		if(res == 0){
			child_num = i;
			for( j = 0 ; j < m ; j++ ){
				/* Keep reading from shared memory until the previous
				   process has written our number (busy wait) */
				while( *shm_ptr != child_num )
					;

				/* Write our 1-based number to the file */
				num_char = sprintf(buf, "%d\n", child_num + 1);
				write(fd, buf, num_char);
				/* Hand the turn to the next child */
				*shm_ptr = (child_num + 1) % n;
			}
			close(fd);
			_exit(0);
		}
	}
	/* Parent: wait for all children, then remove shared memory */
	for(i = 0; i < n; i++)
		wait(NULL);
	shmdt((void *) shm_ptr);
	shmctl(shm_id, IPC_RMID, 0);
	close(fd);
	exit(0);
}

Code tags added for readability -- Perderabo

DreamWarrior · April 19, 2004, 2:29pm

Your code would be more easily readable with tabs :D. To keep them, wrap the code in the forum's code tags.

Anyway, it seems as though you are entering an infinite while loop "polling" the shared memory for a change in the child. Then you are writing back to the shared memory in said child for another to pick it up. This is all being done without any semaphore protection and therefore it is possible that things could go awry.

At any rate, this polling makes inefficient use of the CPU cycles, IMO. While it gets your assignment done...I just wanted to point out that it's probably not an effective way of doing it.

However, it looks like you got it done...congrats.

edit: looks like the child does the shared memory write...this is worse yet.

forumGuy · April 19, 2004, 2:44pm

Thank you for the feedback,will take it into consideration.
"Because the child is "only reading" the data the semaphore is potentially unneeded, but theoretically a "partial write" by the server could be misinterpreted by the child"
-- Would the partial write matter if you are looking for a specific value?
Just trying to understand this stuff.
Thanks,
FI

DreamWarrior · April 19, 2004, 4:02pm

Theoretically, yes...in practice, you'll probably never see it happen. However when, if, it does you'll be banging your head trying to figure out what went wrong if you don't have proper synchronization. In fact, in multi-process (multi-threaded) applications this is one of the single biggest coding issues, IMO. (More accurately, trying to tune said access to a point where it is both reliable and fast.)

Look at it like this:

You have multiple processes accessing memory, and therefore there exists the potential for the kernel to "context switch" in the middle of a read or write that takes more than one instruction. Whether you see a problem depends on how the read or write is done.

Let's say your platform uses 32-bit integers. At the very least, on a hardware (CPU instruction) level, you can guarantee the atomic (i.e. single non-context-switchable operation) read/write of only 8 bits of data (more in most cases, but the minimum register size of any CPU used today is probably going to be 8 bits). The CPU instruction set, kernel, compiler, etc. all come into play in determining what gets done when reading/writing an integer to/from said shared memory (or in fact any memory). If your architecture can guarantee the atomic read/write of said 32 bits then you're fine. You'll never have a problem because it would be impossible for the application to be switched out during the read/write of an integer.

HOWEVER, if it can not, then application 1 could be in the middle of a read or write when the kernel context switches in application 2 and then this incomplete operation is potentially an issue.

The point is, you really don't know what's going on below you and therefore the possibility does exist for this to get confused. This is exactly why semaphores exist: to protect and synchronize this access. This way an application knows when its reads/writes can be "safely" done.

For example, let's say a certain platform can only modify 16 bits of data at a time. To tinker with a 32 bit area would require two (or more) operations. If the Kernel allowed a context switch between them, then it'd be possible (although highly unlikely) for the following sequence of events:

process 1: read first 16 bits
process 2: write first 16 bits
process 2: write second 16 bits
process 1: read second 16 bits

now process 1's view is distorted.
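
The sequence above can be replayed deterministically in a single process. This is only a simulation on the hypothetical 16-bit platform: the two "processes" are interleaved by hand, and the names split32, join_halves and torn_read are made up.

```c
#include <stdint.h>

/* A 32-bit value that can only be accessed 16 bits at a time. */
struct split32 { uint16_t lo, hi; };

uint32_t join_halves(uint16_t lo, uint16_t hi)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Replays the interleaving from the post and returns what the reader saw. */
uint32_t torn_read(struct split32 *shared, uint32_t new_value)
{
    uint16_t lo, hi;

    lo = shared->lo;                                /* process 1: read first 16 bits   */
    shared->lo = (uint16_t)(new_value & 0xFFFFu);   /* process 2: write first 16 bits  */
    shared->hi = (uint16_t)(new_value >> 16);       /* process 2: write second 16 bits */
    hi = shared->hi;                                /* process 1: read second 16 bits  */

    return join_halves(lo, hi);
}
```

Starting from 0x11111111 and writing 0x22222222, the reader ends up with 0x22221111, a value that was never stored by anyone. Holding a semaphore around each complete read and each complete write rules this interleaving out.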

While in practice this is probably NEVER going to occur, it could, and bugs like these are very hard to track down because reproducing this case would be almost impossible. Best just to code it correctly to ensure that everything is ok.