When I am writing my own interpreter...

I think you need to turn the list you have into a chain of processes, where each process is one element in the chain. Then you can associate the arguments with the appropriate process, and once the list is fully assembled, have a piece of code that walks the whole list, forking, pipe()'ing and exec'ing as required.

Of course the whole linked list of processes becomes one job.

Hmm... I seem to be doing something wrong. Here's what I've been doing...

I'm creating a pipe with int fd[2]; and then doing roughly this:

int fd[2];
pipe(fd);

pid_t pid1 = fork();
if (pid1 == 0) {
        close(fd[1]);
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        execlp("wc", "wc", (char *) 0);
}
else {
        pid_t pid2 = fork();
        if (pid2 == 0) {
                close(fd[1]);
                dup2(fd[0], STDIN_FILENO);
                close(fd[0]);
                dup2(fd[1], STDOUT_FILENO);
                close(fd[1]);
                execlp("wc", "wc", (char *) 0);
        }
        else {
                //parent continues
                pid_t pid3 = fork();
                if (pid3 == 0) {
                        close(fd[0]);
                        dup2(fd[1], STDOUT_FILENO);
                        close(fd[1]);
                        execlp("ls", "-la", (char *) 0);
                }
        }
}

What I'm trying to do here is create a two-level pipeline for the command: ls -la | wc | wc

It works up to one level, but after that I'm getting a "dup2: Bad file descriptor" error, maybe because of child 2... I'm not sure what mistake I'm making... Any suggestions on how to get around this one?

I was actually thinking about one silly thing: when I'm creating multiple pipes, I need to create multiple file descriptors, right? How would I do that? Just define a pool of unused file descriptors and use them only when I need them, or is there another way?

I tried defining

int fd[4][2];

to get four pipes, but it isn't working, probably for some obvious reason... Any advice please?

I personally would associate the file descriptors with the struct that describes each process, so each struct may look like

struct shell_proc
{
   struct shell_proc *next;
   int fd_stdin,fd_stdout,fd_stderr;
   int argc;
   char **argv;
   char **environ;
   pid_t pid;
   int exit_code;
};

This is a great attempt! :slight_smile:

If the file descriptor array is defined as

int fd[4][2];

you are again imposing a limit on the number of processes that can be piped to each other.

(Or are you trying first with 'n' processes for piping, where n <= 4, and then moving on to a more generic attempt?)

##############

While replying to this thread a doubt occurred to me; it may be worth asking it here as well.

So basically a pipe is a KDS - a kernel data structure with its own memory capacity, and therefore a limit on the number of bytes it can hold.

Take an example piping process like this

ls | wc -l

(assume ls runs over a million files and so definitely takes some 'n' seconds to complete)

case 1: Does the wc process need to wait until ls has flushed all the filenames to the '|'?
case 2: Or, as and when 'ls' writes the filenames to the kernel data structure, does the 'wc' process start using that information to count the number of files?

What would the scenario be if the information from process 1 ('ls') overruns the capacity of the pipe data structure? Will process 1 be signalled by the kernel to stop flushing data to the pipe, and at the same time process 2 signalled by the kernel to start consuming data from the KDS, so that process 1 can push more information into the pipe data structure?

Is this how process 1 pumps information into the pipe and process 2 consumes it?

Keep this thread alive, it is becoming more interesting :slight_smile:

Pipes are implemented in the kernel and can be efficiently implemented as a ring buffer of, say, 512 to 4096 bytes. A write adds to the buffer, but if it attempts to write more than fits, it blocks until some data has been read out. Similarly, a read will block until data is available.

Normally a pipe() is unidirectional, but on some platforms pipe() is actually implemented using socketpair(AF_UNIX, SOCK_STREAM, 0);

It may not directly help you, but this is the document that introduced pipes; I just stumbled across it on the net: pipes invented

The best information about recursive parsing and techniques of lexical analysis is still the "Dragon book" (real name: Compilers: Principles, Techniques, and Tools) by Aho, Sethi and Ullman (it depends on your prior education, but you come across as someone who should be able to understand it). The authors offer better advice in this regard than perhaps any of us could provide.

bakunin

Thanks, I'll try to get that book... Actually I'm considering rewriting a few parts because I see that I'm not able to achieve what I need. However, I was curious about one thing:

int status;
pid_t pid;
int pipe_a[2], pipe_b[2];

pipe(pipe_a);
pipe(pipe_b);

if (!fork())
{
        dup2(pipe_a[1], 1);
        closeall();
        execlp(argv[0], argv[0], argv[1], (char *) 0);
}
else {
        if (!fork())
        {
                dup2(pipe_a[0], 0);
                dup2(pipe_b[1], 1);
                closeall();
                execlp(argv[3], argv[3], (char *) 0);
        }
        else {
                if (!(pid = fork()))
                {
                        dup2(pipe_b[0], 0);
                        closeall();
                        execlp(argv[5], argv[5], (char *) 0);
                }
                else {
                        wait(NULL);
                        return 1;
                }
        }
}

argv[0] = "ls"
argv[1] = "-la"
argv[2] = "|"
argv[3] = "wc"
argv[4] = "|"
argv[5] = "wc"

When I run this, what surprised me is that it doesn't display the output, but when I type quit (which exits my shell), it displays the output and then exits... Why isn't it returning back?

Also, when I use ps in my shell to see what processes are running, ls is not there but there are two wc's... And when I compiled the program without that wait(NULL), it still runs, but when I run ps, it shows ls with a <defunct> beside it... It still shows the output when I type quit...

Presumably because the shell itself still has both ends of both pipes open.

Try adding

close(pipe_a[0]);
close(pipe_a[1]);
close(pipe_b[0]);
close(pipe_b[1]);

prior to your wait.

When your shell exits, these file descriptors get closed, hence allowing EOF to be read. EOF will never be read while there is a write end open.

Oh... sounds like that's it... Now I'm getting the required output, but it still has that defunct word beside it... Now all three have that word...

Are you going to reap all of your children?

When a child dies (see SIGCHLD) you need to call wait/wait4/waitpid to get the exit code to stop the process being a zombie.

You have a number of processes that you did not wait for.

In synchronous programming you do

pid=fork();

if (!pid) { child stuff; _exit(1); }

waitpid(pid,....);

Well yeah... I am using exit(1). The reason I'm not using it here in this code is that I can't in the first place... because once execlp succeeds it doesn't give me control back...

But in your example all of the processes are children of the shell. You do the wait as the parent of the fork().

Hence the shell should still reap them.

As a test, put the following in the code for each child after the fork()...

fprintf(stderr,"I am %d, my parent is %d\n",getpid(),getppid());
fflush(stderr);

I guess I now understand why you asked me to use structures, porter Sir... I guess I've had my piece of cake... I'll rewrite this thing... I've already thought of a good structure, so I'll implement that now... Guess experience counts in situations like these... :slight_smile: I'll update as soon as I finish writing that model...

That's what it's all about; there are no right and wrong answers (apart from ones that don't actually work), and there are many ways to skin a cat.

But there are often solutions that are better, simpler or more elegant. :slight_smile: