When I am writing my own interpreter...

While trying my hand at writing an interpreter, I was wondering about a few issues, one of which is the following: when I run a command such as jobs in the shell, I get a list of all the background jobs that are running... But if I need my interpreter to run that command, how would I do it? If I'm not mistaken, jobs is an internal command... So what would I specify as the path if I want to run it using execv?

Job control is internal to the shell, and not all shells support it; for example, ksh does, sh doesn't.

If you want job control, manage it yourself: an internal command, an internal list of jobs, etc.

Oh... can you tell me where I should start to be able to do that? I mean, what should I read to be able to write code for that...

How about you first:

  1. decide what the requirements for job control are: what is it trying to achieve, how should it behave, etc.

  2. then decide how to implement it.

A shell is said to have job control if it allows users to put processes into the background and vice versa (bring them back to the foreground).

Where to start... these are the things to work through (a rough sketch of the first point follows the list):

  1. how to deal with sessions, process groups and the controlling terminal
  2. how to put a process in the background
  3. signal handling
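To make the first point concrete, here is a rough sketch of the usual foreground-launch skeleton. It is illustrative only: it assumes the shell has already put itself into its own process group and ignores the terminal-access signals (SIGTTIN/SIGTTOU), and it omits error handling.

	/* Rough sketch: launching one command as a foreground job.
	 * Assumes the shell already runs in its own process group and
	 * ignores SIGTTIN/SIGTTOU (as an interactive shell normally does). */
	#include <signal.h>
	#include <sys/wait.h>
	#include <unistd.h>

	void launch_foreground(char **argv)        /* hypothetical helper name */
	{
	    pid_t pid = fork();
	    if (pid == 0) {
	        setpgid(0, 0);                     /* child: new process group          */
	        tcsetpgrp(STDIN_FILENO, getpid()); /* ...which now owns the terminal    */
	        signal(SIGINT, SIG_DFL);           /* restore default job-control       */
	        signal(SIGTSTP, SIG_DFL);          /* signals the shell itself ignores  */
	        execvp(argv[0], argv);
	        _exit(127);                        /* only reached if exec failed       */
	    }
	    setpgid(pid, pid);                     /* parent does it too (avoids a race) */
	    tcsetpgrp(STDIN_FILENO, pid);          /* foreground = give it the terminal  */

	    int status;
	    waitpid(pid, &status, WUNTRACED);      /* returns when it exits or stops (^Z) */
	    tcsetpgrp(STDIN_FILENO, getpgrp());    /* shell takes the terminal back       */
	}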

All the best.

Thanks a lot... I've thought of a few things:

  1. I need to maintain a table that contains the job id and the job name.
  2. I cannot put a process that is currently running in the foreground of my interpreter into the background without first stopping it (CTRL+Z).
  3. Once I invoke the stop command, I can issue bg <process_name> to put the task into the background, and then the interpreter will send the process a signal telling it to resume.
  4. If I have to bring a process into the foreground, I need to write code in my interpreter in such a way that it first stops the process, then brings it to the foreground and resumes it (or can it be brought over directly?), and then takes the prompt away from the user (invoke a wait until the process is complete?). A rough sketch of what I mean by steps 3 and 4 follows this list.
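Something like this is what I have in mind for bg and fg (just a sketch; the struct job and its fields are my own hypothetical bookkeeping from the table in step 1):

	/* Sketch of bg/fg as I picture them. */
	#include <signal.h>
	#include <sys/wait.h>
	#include <unistd.h>

	struct job { pid_t pgid; int stopped; };

	void do_bg(struct job *j)
	{
	    kill(-j->pgid, SIGCONT);                /* resume the whole process group   */
	    j->stopped = 0;                         /* it runs on; the shell won't wait */
	}

	void do_fg(struct job *j)
	{
	    tcsetpgrp(STDIN_FILENO, j->pgid);       /* hand it the terminal             */
	    if (j->stopped)
	        kill(-j->pgid, SIGCONT);            /* wake it if it was stopped        */
	    int status;
	    waitpid(-j->pgid, &status, WUNTRACED);  /* shell waits until it exits/stops */
	    tcsetpgrp(STDIN_FILENO, getpgrp());     /* then takes the terminal back     */
	}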

I am a little confused about jobs and processes, i.e. the difference between them. I hope someone can clarify whether my steps above seem ok, and also the difference between a job and a process...

The steps you want to follow look ok. The difference between jobs and processes is that a job is specific to the shell you are running. For example, if you run a background job in a shell and then check with the jobs command, you get a job list starting from job #1. If you start another shell (inside this one), run another background job there and run jobs again, you will again get a list starting from job #1.

This is not true of processes. A pid is a pid no matter which shell you look at it from.

So a job is specific to the shell, while a process runs at the system (OS) level.

Thank you so much. Can I say that "nano" or "vi" is a process and a command like "ls | grep .c" is a job?

And while implementing pipes, when I issue a command such as "ls | grep .c" in my interpreter, after parsing it, what should I be doing? I read about file descriptors and am assuming the following has to be done:

  1. Parse the command line
  2. argv[0] contains ls, so fork a process and execute it but redirect the output to a file (I don't know how this can be done internally in the C code. I know I need to use execv to execute, but how will I redirect?)
  3. In the next parse, I scan for the "|" character, so I know that the user wants to pipe the output. At this stage I would fork another process with the argv[2] string, i.e. "grep .c" (but this will be stored in argv[2] and argv[3]; how will I know that the second command has command line arguments too, and how should I handle them?), and then direct its output to stdout.

Please let me know if the above steps are ok...

First of all, all the commands you specified are processes. nano will be one process, vi will be another. "ls | grep xyz" will create two processes, one for the ls and one for the grep. A job is what the interpreter (the shell) handles internally. A job may consist of one or more processes. It is basically one or more processes that the shell treats as a single unit when, say, handling signals.
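In C terms, one way to picture a job-table entry would be something like this (purely illustrative; the field names are made up):

	#include <sys/types.h>

	struct job {
	    int    job_id;     /* the small number that `jobs` prints            */
	    pid_t  pgid;       /* process group containing the job's processes   */
	    char  *cmdline;    /* e.g. "ls | grep xyz"                           */
	    pid_t *pids;       /* the individual processes: one ls, one grep     */
	    int    npids;
	};

	/* Because they share a process group, the shell can signal the job
	 * as one unit, e.g.  kill(-job->pgid, SIGTSTP);                      */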

As for your second question, shouldn't you ask your college instructor for help with your homework? But overall, you are on the right track. Choose your separators well. White space (spaces and tabs) can occur within a single command, but a | or a newline (\n) is definitely a command separator. Treat them as such.
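As a sketch of that idea, a tokenizer could split on whitespace but hand back "|" as a token of its own (quoting, "&", ">" and the rest are ignored here; the function name is just for illustration):

	#include <ctype.h>

	int tokenize(char *line, char *tok[], int max)
	{
	    int n = 0;
	    while (*line && n < max) {
	        if (isspace((unsigned char)*line)) {
	            *line++ = '\0';          /* whitespace just ends the word before it */
	        } else if (*line == '|') {
	            *line++ = '\0';          /* '|' ends the word before it ...         */
	            tok[n++] = "|";          /* ... and becomes a token of its own      */
	        } else {
	            tok[n++] = line;         /* start of a new word                     */
	            while (*line && !isspace((unsigned char)*line) && *line != '|')
	                line++;
	        }
	    }
	    return n;                        /* number of tokens filled into tok[]      */
	}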

Not exactly homework (I'm a research student doing this in my free time so that I can get a grip on Unix); I'm doing it out of curiosity... And well, I am not asking for a solution here... I am writing down my thoughts and asking if they're right or wrong... Anyway, if you feel I shouldn't get any help, I respect your decision, but please consider that I'm a beginner...

By the way, thanks for the explanation of a job and a process. That makes a lot of things clear...

I'm not saying that you shouldn't get any help at all; in fact, you are actually making an effort to do this yourself, while most people just ask for code.

Just that it is against forum rules to post homework questions.

I'm looking forward to using "lsh" when Legend986 is done, it looks good so far.

Well, I understand... But that's my point, I'm not in search of a solution or the code... I'm not a Computer Science student (I made this point clear a few weeks back in the same forum) and I need some help in learning all this because I'm lost in a few places... Well, even then, what you say is right... Even though this is not my homework, I now understand your reason: these "homeworks" are given in many universities... lol

I don't quite remember using the term lsh anywhere... Can you please tell me what it is? Or by any chance did you give a name to my shell? :slight_smile:

Legend986 shell. :slight_smile:

:slight_smile: Yeah sure... I might seem a little ambitious and I'm sure it'll take some time to build all that but I'll show it to you once I'm done with it... So do you have any suggestions to give me as far as my last post is concerned? I've just pasted the relevant matter here:

And while implementing pipes, when I issue a command such as "ls | grep .c" in my interpreter, after parsing it, what should I be doing? I read about file descriptors and am assuming the following has to be done:

  1. Parse the command line
  2. argv[0] contains ls, so fork a process and execute it but redirect the output to a file (I don't know how this can be done internally in the C code. I know I need to use execv to execute, but how will I redirect?)
  3. In the next parse, I scan for the "|" character, so I know that the user wants to pipe the output. At this stage I would fork another process with the argv[2] string, i.e. "grep .c" (but this will be stored in argv[2] and argv[3]; how will I know that the second command has command line arguments too, and how should I handle them?), and then direct its output to stdout.

You need to read in a logical line, so a "\" followed by a newline continues the line.

Then you need to do macro expansion, so look for every "$" and expand it using the corresponding environment variable.
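A minimal sketch of that step, assuming only plain $NAME expansion (no ${NAME}, no quoting rules; the function name is made up):

	#include <ctype.h>
	#include <stdlib.h>

	void expand_vars(const char *in, char *out, size_t outsz)
	{
	    size_t o = 0;
	    while (*in && o + 1 < outsz) {
	        if (*in == '$') {
	            char name[64];
	            size_t n = 0;
	            in++;                                   /* skip the '$'           */
	            while ((isalnum((unsigned char)*in) || *in == '_') && n + 1 < sizeof name)
	                name[n++] = *in++;
	            name[n] = '\0';
	            const char *val = getenv(name);         /* look up the variable   */
	            while (val && *val && o + 1 < outsz)
	                out[o++] = *val++;                  /* copy its value in      */
	        } else {
	            out[o++] = *in++;                       /* ordinary character     */
	        }
	    }
	    out[o] = '\0';
	}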

I would then look for any commands between back quotes, run a subshell to execute the contents, and substitute its stdout back into the command line.

Then I would split the result at the pipes to work out what the actual processes and their arguments are.

Then I would pull out the <, >, << and >> tokens and do the appropriate I/O redirection.
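For example, ">" can be handled in the child just before the exec: open the file and make it stdout with dup2() (a sketch only, with a made-up helper name):

	#include <fcntl.h>
	#include <unistd.h>

	int redirect_stdout_to(const char *path)
	{
	    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	    if (fd < 0)
	        return -1;
	    dup2(fd, STDOUT_FILENO);   /* stdout now refers to the file          */
	    close(fd);                 /* the original descriptor isn't needed   */
	    return 0;
	}
	/* "<" is the same idea with O_RDONLY and STDIN_FILENO;
	 * ">>" uses O_WRONLY | O_CREAT | O_APPEND.                              */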

Then I would look for a "&" and, if I found one, set a flag to say don't wait for the result.

Finally I would set up the pipeline and do the chain of forks and execs.
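The usual pattern for that last step is a loop: one pipe() per "|" and one fork()/exec per command. A sketch, assuming the line has already been split into NULL-terminated argv arrays (run_pipeline and cmds are names I made up, not anything standard):

	#include <sys/wait.h>
	#include <unistd.h>

	void run_pipeline(char **cmds[], int ncmds)
	{
	    int in_fd = STDIN_FILENO;           /* read end handed to the next command */

	    for (int i = 0; i < ncmds; i++) {
	        int fd[2] = { -1, -1 };
	        if (i < ncmds - 1)
	            pipe(fd);                   /* pipe between command i and i+1 */

	        pid_t pid = fork();
	        if (pid == 0) {                 /* child: wire up stdin/stdout, then exec */
	            if (in_fd != STDIN_FILENO) {
	                dup2(in_fd, STDIN_FILENO);
	                close(in_fd);
	            }
	            if (i < ncmds - 1) {
	                dup2(fd[1], STDOUT_FILENO);
	                close(fd[0]);
	                close(fd[1]);
	            }
	            execvp(cmds[i][0], cmds[i]);
	            _exit(127);                 /* exec failed */
	        }

	        /* parent: drop descriptors it no longer needs and keep the
	         * read end of the new pipe for the next command */
	        if (in_fd != STDIN_FILENO)
	            close(in_fd);
	        if (i < ncmds - 1) {
	            close(fd[1]);
	            in_fd = fd[0];
	        }
	    }

	    while (wait(NULL) > 0)              /* wait for the children; a real shell
	                                           would track specific pids/pgids */
	        ;
	}

Closing the unused descriptors in the parent matters: if the parent keeps a write end open, the reader never sees end-of-file.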

Quite where you choose to detect "cd", "if", "while", "do", etc. is a good question.
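One common answer, for "cd" at least, is to check for builtins after parsing but before forking, since cd has to change the shell's own working directory. A sketch (the helper name is made up):

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int try_builtin(char **argv)
	{
	    if (argv[0] == NULL)
	        return 0;
	    if (strcmp(argv[0], "cd") == 0) {
	        const char *dir = argv[1] ? argv[1] : getenv("HOME");
	        if (dir == NULL || chdir(dir) != 0)
	            perror("cd");
	        return 1;                /* handled here: do not fork/exec        */
	    }
	    return 0;                    /* not a builtin: go the fork/exec route */
	}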

Thank you so much... Will get to work now that I've got some suggestions from you :slight_smile: Will let you know as and when I'm done...

Thanks to everyone here... I am slowly getting the shell to take shape... I'm actually stuck at piping... I am able to handle a single pipe, but how do I solve the problem of multiple pipes? I know it can be solved using recursion, but some pseudo algorithm would be excellent... I don't understand how to actually use recursion here... Currently the parent creates two children: the first one executes one command and pipes its output to the second child, which displays the output...

And when I used valgrind, to my surprise I found 15 memory leaks in the piping function that I wrote, and I don't understand what could've gone wrong... My pseudo code looks something like this:

	
	#include <stdio.h>
	#include <stdlib.h>
	#include <sys/wait.h>
	#include <unistd.h>

	void run_ls_grep(void)          /* the piping function, for "ls | grep .c" */
	{
	    int fd[2];                  /* file descriptor pair for the pipe */
	    pid_t pid1, pid2;           /* process ids for each child */

	    /* create pipe and check for an error */
	    if (pipe(fd) < 0) {
	        perror("pipe error");
	        exit(1);
	    }

	    /* apply fork and check for error */
	    if ((pid1 = fork()) < 0) {
	        perror("fork error");
	        exit(1);
	    }

	    if (pid1 == 0) {
	        /* processing for first child: it READS from the pipe */
	        close(fd[1]);           /* close output end, leaving input open */
	        /* set standard input to pipe */
	        if (fd[0] != STDIN_FILENO) {
	            if (dup2(fd[0], STDIN_FILENO) != STDIN_FILENO) {
	                perror("dup2 error for standard input");
	                exit(1);
	            }
	            close(fd[0]);
	        }
	        /* execlp the second command (the right-hand side of the pipe) */
	        execlp("grep", "grep", ".c", (char *)NULL);
	        perror("execlp grep");
	        exit(127);
	        /* first child finished */
	    }
	    else {
	        /* processing for parent: spawn second process */
	        /* apply fork again for second child */
	        if ((pid2 = fork()) < 0) {
	            perror("fork error");
	            exit(1);
	        }

	        if (pid2 == 0) {
	            /* processing for second child: it WRITES to the pipe */
	            close(fd[0]);       /* close input end, leaving output open */
	            /* set standard output to pipe */
	            if (fd[1] != STDOUT_FILENO) {
	                if (dup2(fd[1], STDOUT_FILENO) != STDOUT_FILENO) {
	                    perror("dup2 error for standard output");
	                    exit(1);
	                }
	                close(fd[1]);   /* not needed after dup2 */
	            }
	            /* execlp the first command; it prints to the pipe, now standard output */
	            execlp("ls", "ls", (char *)NULL);
	            perror("execlp ls");
	            exit(127);
	        }
	        else {
	            /* processing continues for parent */
	            close(fd[0]);       /* the parent must close BOTH ends, or   */
	            close(fd[1]);       /* the reader will never see end-of-file */
	            waitpid(pid1, NULL, 0); /* wait for first child to finish  */
	            waitpid(pid2, NULL, 0); /* wait for second child to finish */
	        }
	    }
	}

Am I doing something wrong?

It should be simpler than the way you have it....

all that "|" means is give the write end to stdout of the left hand process and the read end to stdin for the right hand process.

But apart from > and <, stdin/stdout/stderr should just be left alone.

To do ">" you just open a file and use it for stdout, end of story.

I personally would parse the line into a tree where each node is what I want to run in one process; each node would have pointers to where it gets its stdin/stdout/stderr from.
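Roughly the kind of node I mean (fields purely illustrative):

	struct cmd_node {
	    char            **argv;        /* what this process should exec           */
	    struct cmd_node  *stdin_from;  /* NULL = terminal, else the pipe producer */
	    struct cmd_node  *stdout_to;   /* NULL = terminal, else the pipe consumer */
	    char             *in_file;     /* "< file", if any                        */
	    char             *out_file;    /* "> file" or ">> file", if any           */
	    int               append;      /* non-zero for ">>"                       */
	};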

Hmm... your method seems more logical... But I think I've done the same thing, the only difference being that I've written a few extra lines of code :slight_smile: I know it's a mistake that should have been fixed at the beginning, but I've already written my parser, and it gives me something like:

argv[0] = ls
argv[1] = -la
argv[2] = |
argv[3] = wc
argv[4] = |
argv[5] = wc

for an input like "ls -la | wc | wc"

I am able to get it to work for "ls -la | wc", but the next step is what is puzzling me... Is it possible with the architecture that I've written?
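What I'm thinking of trying is to walk that argv and cut it at every "|", something like this (not sure this is the right way; split_on_pipes is just a name I made up):

	#include <string.h>

	/* Cuts the flat argv into one argv per command at each "|".
	 * cmds[i] points into the original array; the "|" slots are
	 * overwritten with NULL so each piece is NULL-terminated. */
	int split_on_pipes(char **argv, char **cmds[], int max)
	{
	    int ncmds = 0;
	    char **start = argv;
	    for (char **p = argv; ; p++) {
	        if (*p == NULL || strcmp(*p, "|") == 0) {
	            int at_end = (*p == NULL);
	            *p = NULL;                   /* terminate this command's argv    */
	            if (ncmds < max)
	                cmds[ncmds++] = start;
	            if (at_end)
	                break;
	            start = p + 1;               /* next command begins after the "|" */
	        }
	    }
	    return ncmds;
	}

The resulting cmds array could then drive a loop of pipe()/fork()/dup2()/execvp(), one iteration per command.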

And yeah, thanks for the tip on redirection... Will attempt that tomorrow :slight_smile: