Some questions regarding old if.c

orbit · December 31, 2014, 2:19pm

Hey
I have some questions regarding the old unix if command.
(see man page: man.cat-v.org/unix-6th/1/if)

Here I uploaded the source code (it's a bit too long to put here):
pastebin.com/bj0Hvfrw

Now my questions:

1.) Line 14: The function exp() is called with no arguments. But the function is declared as exp(s), so it needs an argument. Why is this working?

2.) What's happening there with exp() -> e1() -> e2() -> e3()... I think it's called recursive descent parsing, but I don't really get it. Could you help me a bit there?

ongoto · December 31, 2014, 2:44pm

Could you show us some code and usage of this 'old if.c' and explain what you are trying to do with it? That pastebin stuff is nonsense.

orbit · December 31, 2014, 2:58pm

I am trying to understand the whole source code. I would like to know what each function does and what each line does. So I picked some lines I do not understand and wrote the questions here.

Why is it nonsense? It's the actual source code of the command if (unix ver.6).
How to use the command in unix is described in the posted man page.

ongoto · December 31, 2014, 3:45pm

I meant no offense.
I'm just going along with what you said. The questions you raised support the fact that it doesn't make any sense, right?

if (exp()) is asking if the function exists; it's not calling that function.
p1 is not assigned to e1(), e2(), etc; p1 is a pointer to (the address of) those functions. By doing that it's redirecting the search for special characters (-r, -w, -c, etc.). The logic is: "If you don't find what you want here, I'll provide you another place to look", and so on.

A typical if statement would be something like...
if ( x = y ) { then do something
The source you provided checks for the characters '{} () = !=' and other options. In other words it's just checking for proper syntax and storing arguments for some later action. That's about all it's good for. Eventually it will return true or false or error.
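On the second question (recursive descent), the idea can be sketched in modern C. This is not the V6 source: the function names (parse_or, parse_and, parse_not, parse_primary, match_tok, eval_tokens) and the "true"/"false" primaries are made up for illustration. Each function handles one operator and hands everything that binds tighter to the next function down, which is presumably what the exp() -> e1() -> e2() -> e3() chain does as well.

```c
#include <string.h>

/* Hypothetical token cursor: toks[pos] is the next unread token,
   and the array ends with a NULL pointer. */
static const char **toks;
static int pos;

static int parse_or(void);

/* Consume the next token and return 1 if it equals s; otherwise leave it. */
static int match_tok(const char *s)
{
    if (toks[pos] != NULL && strcmp(toks[pos], s) == 0) {
        pos++;
        return 1;
    }
    return 0;
}

/* primary := "true" | "(" or-expr ")" | any other token (counts as false) */
static int parse_primary(void)
{
    int v;

    if (match_tok("(")) {
        v = parse_or();
        match_tok(")");      /* sketch only: a missing ")" is ignored */
        return v;
    }
    if (match_tok("true"))
        return 1;
    if (toks[pos] != NULL)
        pos++;               /* skip the unrecognized token */
    return 0;
}

/* not-expr := "!" primary | primary */
static int parse_not(void)
{
    if (match_tok("!"))
        return !parse_primary();
    return parse_primary();
}

/* and-expr := not-expr [ "-a" and-expr ] */
static int parse_and(void)
{
    int v = parse_not();

    if (match_tok("-a")) {
        int rhs = parse_and();   /* always parsed so its tokens are consumed */
        return v && rhs;
    }
    return v;
}

/* or-expr := and-expr [ "-o" or-expr ] */
static int parse_or(void)
{
    int v = parse_and();

    if (match_tok("-o")) {
        int rhs = parse_or();
        return v || rhs;
    }
    return v;
}

/* Evaluate a NULL-terminated token array; returns 1 for true, 0 for false. */
int eval_tokens(const char **tokens)
{
    toks = tokens;
    pos = 0;
    return parse_or();
}
```

Because parse_or calls parse_and, which calls parse_not, "-a" ends up binding tighter than "-o" and "!" tighter than both, without any explicit precedence table.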

achenle · January 1, 2015, 12:35pm

The source code is old K&R C, without function declarations.

Don't write code like that, and don't ever modify old K&R C by adding function declarations - unless you like getting into the intricacies and implications of C variable promotion rules, and how they may have changed over the years.

Original K&R C just took all arguments to a function, promoted them so they'd all be the same size, and stuffed them on the stack.

I think, if arguments aren't declared after the first function definition line:

main(argc, argv) <-- definition
char *argv[];  <-- argument declaration
{
    ....

then the argument implicitly defaults to "int".

Basically, in K&R C all functions are called as variable argument functions with every argument promoted to the same size, and the arguments aren't type-checked. Ever. And argument declarations in the function definitions only tell the function how to interpret the data in the variable passed - whatever that value may be, with, again, no type checking.

A "declaration" is code that tells the compiler what something is - think of it as a customs declaration for a bottle of booze - you're telling customs that you have a bottle of booze somewhere in your luggage, and what it is. It's not the bottle itself.

A "definition" is code that IS the function or variable. It's the bottle itself.

K&R C has pretty much no declarations. No one knows what anything else is. Try making drinks without knowing in advance what's in every bottle of booze...
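To make the declaration/definition distinction concrete, here is a small sketch in modern (prototype-style) C rather than K&R C. The names add_ints and sum_three are hypothetical and exist only for this example.

```c
/* Declaration (prototype): tells the compiler the name, the return type
   and the parameter types. It contains no code. */
int add_ints(int a, int b);

/* This caller is compiled knowing only the declaration above, so the
   compiler can check the argument count and types at each call. */
int sum_three(int a, int b, int c)
{
    return add_ints(add_ints(a, b), c);
}

/* Definition: the function body itself. */
int add_ints(int a, int b)
{
    return a + b;
}
```

With the prototype in place, a call such as add_ints(1) is rejected at compile time, which is exactly the check that old K&R-style code does not get.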

orbit · January 3, 2015, 8:27pm

Thank you ongoto and thank you achenle for your great explanation

I worked really hard on understanding the code and I nearly got everything.
This is the last part I do not understand:

if(eq(a, "{")) { /* execute a command for exit code */
    if(fork()) /*parent*/ wait(&ccode);
    else { /*child*/
        doex(1);
        goto err;
    }
    while((a=nxtarg()) && (!eq(a,"}")));
    return(ccode? 0 : 1);
}

As described in the man-page (if page from Section 1 of the unix-6th manual), if we put the command in brackets "if expr { command } ", we can obtain its exit code.

So we fork the current process, and then wait for our child process to finish? But where does the child process continue its work? After the fork, we go into the while-loop, just skip some arguments, and then return with ccode? Where was ccode changed? What is ccode?

Could you please explain the given code snippet to me? And elaborate on ccode?

The man page of wait: wait page from Section 2 of the unix-6th manual
The source code: [C] code - Pastebin.com

Thank you very much

ongoto · January 3, 2015, 11:29pm

Part of an if statement can be to call (fork) an external function; e.g. sed. You could say a == (the results of) sed /something/ for example. ccode is just the name of a variable. Judging by the way it's used here, ccode could be described as a container for:

The child process will be running in its own "shell" program space and then return its status (success or failure).
return(ccode? 0 : 1); is a ternary expression. (You can look up ternary). ? indicates a test, 0 is returned if test is true, 1 if false. I think in this case, true would indicate no errors.

Don_Cragun · January 4, 2015, 1:10am

If you look closely, I think you'll find that that is " if { command } " (with no expr); the exit status of command , in this case, is the expression.

ongoto already explained most of what is going on in the code above. From your comments above, I get the feeling that you don't understand how fork() works. If we look at the code:

    if(fork()) /*parent*/ wait(&ccode);
    else { /*child*/
        doex(1);
        goto err;
    }
    while((a=nxtarg()) && (!eq(a,"}")));
    return(ccode? 0 : 1);

What you have to understand is that after a successful call to fork() , it essentially returns twice; once in the parent process and once in the child process. In the parent process (the one that called fork() ), the return code is never zero; if it is positive, it is the process ID of the newly created child process, and if it is negative, it indicates that the fork was unsuccessful and errno will indicate what error occurred (in this case there is no child process).

In the child process, fork() always returns zero.

So the else branch (doex(1); goto err;) is only executed in the child process, and wait(&ccode) is only executed in the parent process. Presumably, the function doex() in the child will parse the expression between the { and } , evaluate it, and then branch to the code at the label err which should terminate the child process with an exit code indicating whether the expression evaluated to true or false.

The wait(&ccode) in the parent waits for the child to exit and saves the exit status from the child in the variable ccode . Then the while loop in the parent skips over the expression that was evaluated by the child and returns true or false depending on the exit status of the child.

Note also that exit code 0 conventionally indicates success and a non-zero exit code indicates a failure. So, the command:

test -r file

produces exit status zero if file is readable and produces a non-zero exit status if file can't be found or is not readable. As ongoto explained, the ternary operator converts the exit code convention (0 for success, non-zero for failure) into the C convention (1 for true, 0 for false).
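The same fork/wait pattern can be sketched with the modern POSIX interface. This is only an illustration, not the V6 code: child_succeeded is a hypothetical name, and the child simply exits with a given code instead of running a real command.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with child_code. Return 1 ("true") if the child
   exited with status 0, 0 ("false") otherwise, and -1 if fork or wait failed. */
int child_succeeded(int child_code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;             /* fork failed: there is no child */
    if (pid == 0)
        _exit(child_code);     /* child: stands in for running the command */

    /* parent: block until that particular child terminates */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return 0;              /* killed by a signal: treat as failure */

    /* exit code convention (0 = success) -> C convention (1 = true) */
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The last line plays the same role as return(ccode? 0 : 1) in the original: it flips the exit code convention into a C truth value.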

achenle · January 4, 2015, 1:41am

The more I think about it, the less sure I am about that. As K&R C treats pretty much all functions as variable-argument, how could the compiler know what you meant? There's no way for a compiler to know if you're doing an existence check which would simply evaluate to the address of the function, or call the function with zero arguments.

I think there's a good chance that code that you identified actually does make the call to exp(), with unknown data on the stack. I think to just check if the function exists, the code would be if ( exp ) .
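The difference between naming a function and calling it can be checked with a small modern C sketch. The names probe, test_address, test_call and call_count are hypothetical; the counter only exists to show whether the function body actually ran.

```c
#include <stddef.h>

static int calls = 0;          /* counts how often probe() really runs */

static int probe(void)
{
    calls++;
    return 0;
}

/* Uses the function's name without parentheses: this yields its address,
   and no call happens. */
int test_address(void)
{
    int (*fp)(void) = probe;
    return fp != NULL;
}

/* Uses the call operator: probe() runs and its return value is tested. */
int test_call(void)
{
    return probe() != 0;
}

/* Report how many times probe() has been executed so far. */
int call_count(void)
{
    return calls;
}
```

In other words, the parentheses are what make it a call, so an expression of the form if (f()) tests the return value of f, while if (f) tests its address.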

Either way, anyone who writes code like that without comments on WHAT is being done is being incompetent, in my opinion.

When you go out into the esoteric edges of a programming language like that, even you are likely to not remember exactly what you did later on. And everyone else who didn't write the code is almost certainly going to be stumped for a good bit. Production or library code is not the place to compete in obscure coding ego wars.

And I freely admit that calling a vararg function with zero arguments makes no sense - with undefined junk on the stack, there's no way to know what would happen. That's another reason the code should be commented. To figure out what exactly is going on requires breaking out the C standard and maybe even hardware-specific behavior because "putting variables on the stack" is highly hardware-specific.

orbit · January 4, 2015, 5:44am

Thanks again ongoto, that helped me.
And thank you Don Cragun, for this detailed answer, I really appreciate it

@achenle

Notice that the parameter s in exp(s), isn't used at all in the function. So there should not be a problem calling exp(). So was it just a "mistake" by the programmer?

My last question now would be, how could I print the exit status of a command in unix v6?
I know nowadays you can do it with $?. But how did they do it at that time?

ongoto · January 4, 2015, 9:09am

if (exp()) is asking if the function exists; it's not calling that function.
...
I think there's a good chance that code that you identified actually does make the call to exp(), with unknown data on the stack. I think to just check if the function exists, the code would be if ( exp ) .

That is possible. I can't remember all the type checking details used in those days either. One way Bash (nowadays) distinguishes between a variable named 'func' and a function named 'func()' is the parentheses. But that may not have been true 20 years ago in the C language.

achenle · January 4, 2015, 9:30am

Without any comments from the programmer stating the intent of the code, it's impossible to tell.

Depending on your command shell, it's usually something like this:

-bash-4.1$ command arg1 arg2 ...
-bash-4.1$ echo $?

"$?" is usually the value returned from the last command run in a shell. csh is probably different, but I don't remember exactly.

Don_Cragun · January 4, 2015, 12:43pm

One thing Bourne based shells and csh based shells share is that $? is the exit code of the previous command. But, we're talking about a UNIX Version 6 shell here; not a modern shell. I never used V6 much (although I worked on developing PWB UNIX and MERT both of which were derived from V6). I used the Mashey shell for a while and early versions of the Bourne shell (both of which were developed as replacements for the V6 shell). (As you can probably guess from current shells, the Mashey shell lost out to the Bourne shell; but some Mashey shell features were adopted by the Bourne shell.)

The V6 shell was VERY simple. (You can check out V6 utility man pages here.) Although there is no mention of shell variables or special parameters (e.g., $#, $?, and $!) on the V6 sh(1) man page and it is clear from the description of command execution that there was no $PATH variable, I think the special parameters listed above were provided with their current meanings. Positional parameters aren't even mentioned on the sh(1) man page, but $1 is mentioned on the goto(1) and shift(1) man pages. There were no loops (for, while, or until), but the goto command could be used with the if command to produce loops. The only file descriptors you could redirect were standard input and standard output. (The BUGS section of the man page says it is a bug that diagnostic messages couldn't be redirected.) The Synopsis for the if command was:

if expr command [args...]

Note that there is no then , else , or fi here. And, as has been noted in this thread, if was a standalone utility; not a shell built-in. If I remember correctly, the only built-ins in V6 sh were chdir (there was no cd then), login , newgrp , shift , and probably exit .

orbit · January 4, 2015, 11:15pm

Don Cragun is right, I got a pdp11 simulator working, and there is no $?.

Thanks again

After reviewing the code, I stumbled over a last thing I do not understand.

char *nxtarg() {

	if (ap>ac) return(0*ap++);
	return(av[ap++]);
}

This is the function that returns the next argument. ac is the number of arguments in argv and ap is the index of the current argument.

So I think with

if ( ap > ac )

we are testing whether there are any more arguments. And here comes my question:

Why would you increment ap if ap is already bigger than ac, and there are therefore no more arguments?

Source Code: [C] code - Pastebin.com
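One possible reading, offered as an assumption rather than a confirmed answer: if callers undo a lookahead with an unconditional ap--, then the function has to advance ap on every call, including at the end of the list, so that the decrement always restores the previous position. The sketch below is modern C with hypothetical names (set_args, next_arg, take_if, position) and its own bounds convention; it is not the original code.

```c
#include <string.h>

static char **av;   /* argument vector */
static int ac;      /* number of entries in av */
static int ap;      /* index of the next unread argument */

/* Start reading at index 1, skipping the program name. */
void set_args(int argc, char **argv)
{
    ac = argc;
    av = argv;
    ap = 1;
}

/* Return the next argument, or NULL past the end. ap advances either way. */
char *next_arg(void)
{
    if (ap >= ac) {
        ap++;
        return NULL;
    }
    return av[ap++];
}

/* Look at the next argument and consume it only if it equals s. */
int take_if(const char *s)
{
    char *a = next_arg();

    if (a != NULL && strcmp(a, s) == 0)
        return 1;
    ap--;           /* unconditional undo, correct even at the end of the list */
    return 0;
}

/* Expose the cursor so the effect of take_if() can be observed. */
int position(void)
{
    return ap;
}
```

If next_arg() did not increment at the end of the list, the ap-- in take_if() would step back over an argument that had already been consumed.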