Doubt about pipes and subprocess

royalibrahim · January 19, 2010, 10:58pm

Hi,

I have a basic question. Please see the pipeline below.

command1 | (command2; command3)

I am aware that the command that follows a pipe will be run in a subshell by the Unix kernel. But how about here? Since this set of commands is grouped in parentheses, will they run inside another subshell of the pipe's shell? I hope my question is not too bewildering to answer.

royalibrahim · January 21, 2010, 1:29pm

I would like to know whether I have asked a sensible question, because I have not received any answer to it.

Corona688 · January 21, 2010, 2:16pm

We're not ignoring you if we haven't answered the instant you want an answer. You don't have to bump threads. In fact you shouldn't, it's against the rules you read and agreed to when you registered.

I don't think you've got that quite right. Without complex structures like brackets or loops, a command after a pipe needs no subshell. The shell just creates a new process, arranges the files how you wanted, and replaces the new process with the requested command.

A subshell does have to stick around to manage situations like you illustrated:

command1 | ( command2 ; command3 )

It has to stay resident in order to wait for command2 to finish before starting command3.

A subshell also exists in a situation like this:

command1 | while read LINE
do
        echo "${LINE}"
done

The subshell has to stay resident to process code after the pipe inside the loop.

These subshells have side-effects. Try this:

VARIABLE="hello"
echo asdf | ( cat ; VARIABLE="goodbye" )

echo "VARIABLE=${VARIABLE}"

It will output "VARIABLE=hello", since variables changed inside the subshell aren't changed in its parent.
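For contrast, here is a minimal sketch, assuming a POSIX-style shell such as bash or dash (the variable names PAREN and BRACE are only illustrative). Braces group commands without starting a subshell, so an assignment made inside them survives, while the same assignment inside parentheses is lost:

```shell
# Parentheses run the list in a subshell: the assignment is lost.
PAREN="hello"
( PAREN="goodbye" )

# Braces group the list in the current shell: the assignment sticks.
BRACE="hello"
{ BRACE="goodbye"; }

echo "PAREN=${PAREN}"   # PAREN=hello
echo "BRACE=${BRACE}"   # BRACE=goodbye
```

This only holds outside a pipeline. Putting the braces after a pipe would, in bash with default settings, bring a subshell back, since bash runs each element of a pipeline in its own process.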

Andre_Merzky · January 21, 2010, 2:58pm

Hope this helps:

bash-3.2$ cat getppid.c 

#include <stdio.h>
#include <unistd.h>

int main ()
{
  fprintf (stdout, "%d\n", getppid ());
  return 0;
}

bash-3.2$ make getppid
cc     getppid.c   -o getppid

bash-3.2$ ./getppid >> t |  ( ./getppid >> t ; ./getppid >> t ) ; cat t
14236
14774
14774

I assume you talk about bash - tcsh is different here:

tcsh $  ( ./getppid ; ./getppid )
14867
12872
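The getppid check above can probably be approximated without compiling anything, since a freshly started sh exposes its parent's PID as $PPID. This is only a sketch, assuming bash or dash as the outer shell; the scratch files and variable names are made up for the illustration. Only the first command in the group is compared, because some shells exec the last command of a group directly (which is what the tcsh output above suggests):

```shell
# Hypothetical scratch files, used only to collect the reported PIDs.
direct_file=$(mktemp)
group_file=$(mktemp)

# A fresh sh prints its parent's PID, like the getppid program above.
sh -c 'echo "$PPID"' > "$direct_file"

# Inside parentheses the first command is started by the subshell.
# The trailing ":" keeps that command from being the last in the group,
# since some shells exec the last command directly instead of forking.
( sh -c 'echo "$PPID"' > "$group_file" ; : )

direct_ppid=$(cat "$direct_file")
group_ppid=$(cat "$group_file")
rm -f "$direct_file" "$group_file"

echo "started directly:       parent PID ${direct_ppid}"
echo "started inside ( ... ): parent PID ${group_ppid}"
```

In bash or dash the two PIDs should differ, because the second sh is a child of the resident subshell rather than of the main shell.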

royalibrahim · January 21, 2010, 2:58pm

Thank you Corona688 for your reply.

But anything to the right of a pipe will run in a subshell, not just loops. In non-POSIX shells, the pipe spawns the subshell. The use of a single pipe in a shell creates TWO subshells, one for each side (a pipeline with two | creates three subshells, one for each of the three commands, etc.)

So my question is, how many subshells would be spawned in this scenario?

Also, in

 (command &)

The background job detaches from the current shell and runs in a subshell (separate environment), and we are also enclosing this command in parentheses, hence forcing it to run inside a subshell again. So would it create 2 subshells here, one for the parentheses and one for the "background"?

Corona688 · January 21, 2010, 3:42pm

$ sleep 9000 | cat &
[1] 7557
$ ps
  PID TTY          TIME CMD
 7543 pts/3    00:00:00 bash
 7556 pts/3    00:00:00 sleep
 7557 pts/3    00:00:00 cat
 7559 pts/3    00:00:00 ps

There is no subshell.

$ sleep 9000 | cat | cat | cat | cat | cat &
[1] 7581
$ ps
  PID TTY          TIME CMD
 7543 pts/3    00:00:00 bash
 7576 pts/3    00:00:00 sleep
 7577 pts/3    00:00:00 cat
 7578 pts/3    00:00:00 cat
 7579 pts/3    00:00:00 cat
 7580 pts/3    00:00:00 cat
 7581 pts/3    00:00:00 cat
 7584 pts/3    00:00:00 ps

There's not 5 subshells either.

I think you're under a misapprehension here. The shell does clone itself in order to do redirection through pipes, but unless it's actually needed, the subshell does not stick around: once it's redirected file descriptors the way you asked it to, it replaces itself with the program you asked it to run. By the very act of running the process you asked it to, the subshell is wiped out. In its place is a brand new process with all the same redirections as the subshell used to have.
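The "replaces itself" step can be seen in isolation with the shell's exec builtin. This is a small sketch, assuming a POSIX sh and an external echo program on the PATH; the messages are only illustrative:

```shell
# The inner sh prints one line, then exec replaces it with the echo program.
# The final echo is never run, because the shell that would run it is gone.
output=$(sh -c 'echo "before exec"; exec echo "replaced by echo"; echo "never printed"')
echo "$output"
```

Only the first two messages appear. Nothing after the exec runs, which is why no shell is left behind for a plain command in a pipeline.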

As for (command &): not "also". A subshell remains because you encapsulated it with brackets. If you had not, no subshell would remain -- the subshell would exist for less than an eyeblink, connecting file descriptors as specified and then replacing itself with the new command. Why bother waiting around when there's nothing left for it to do?

...I'm beginning to have a sneaking suspicion that when you say "subshell" you mean "process". Not all processes are shells.

jim_mcnamara · January 21, 2010, 4:33pm

Expanding a bit on the good explanations above:

When you ask the shell to run the "cat" command, the shell forks a child process, and the child calls one of the exec functions.

The result is that the cat executable, not the shell, is then running in the new process.
When the process ends, the shell resumes in the parent process which waited for the cat process to end.
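A rough shell-level sketch of that fork, exec, and wait sequence follows. It assumes a POSIX-style shell; the scratch file and variable names are hypothetical, and the real work happens in system calls inside the shell rather than in script syntax like this:

```shell
# Hypothetical scratch file that captures the child's output.
out_file=$(mktemp)

# "&" makes the shell fork a child; "exec" makes that child replace
# itself with the echo program instead of staying around as a shell.
( exec echo "hello from the child" ) > "$out_file" &
child_pid=$!

# The parent shell waits for the child and collects its exit status.
wait "$child_pid"
child_status=$?

child_output=$(cat "$out_file")
rm -f "$out_file"
echo "child printed '${child_output}' and exited with status ${child_status}"
```

For an ordinary foreground command the shell does the waiting automatically; the explicit & and wait here only make the two halves visible.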