All about exit code

cdin2 · March 9, 2002, 11:22pm

Hi,

I am working on Solaris 8 and the "intro" man page says, "Upon termination, each command returns two bytes of status, one supplied by the system and giving the cause for termination, and (in the case of 'normal' termination) one supplied by the program. The former byte is 0 for normal termination, ..."

So, for the two cases:
(1) normal termination -- I guess the program exit code will be passed through. But there must be a range of numbers the shell can "exit" with, right?
(2) abnormal termination -- how is the exit code calculated? XOR the two bytes of status? Did I misunderstand the intro man page?

When exiting from a shell script, should we avoid certain exit codes so that the customized exit codes will not conflict with system exit codes?

Just curious about the exit code.

CDIN

Perderabo · March 10, 2002, 1:31pm

I just read that intro page. It's rather poorly written. But an accurate answer is going to take a while.

From the standpoint of a C programmer who is using the system call interface, when a program invokes exit() or _exit(), the argument to exit is ANDed with 0377 and the result is the exit code for the process. If a process dies as the result of a default action of a signal, it will get an exit code which is non-zero but which has none of the lower 8 bits set. You're supposed to use the macros which are mentioned on the wait(2) man page to portably determine the signal number. It is one or the other: sending a signal to a process which has started the kernel portion of the exit() system call has no effect. All 8 bits are available to the programmer, so values 0 through 255 are legal.
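As a rough illustration of those two cases, seen from a shell rather than from C, here is a minimal sketch. It assumes a POSIX-style `sh` is on the PATH; the value 300 and the choice of SIGTERM are arbitrary, and the exact number reported for the signal case depends on the shell.

```shell
#!/bin/sh
# Case 1: the child calls exit with 300. Only the low 8 bits survive,
# so the parent sees 300 & 0377 = 44.
masked=0
sh -c 'exit 300' || masked=$?
echo "child asked for 300, parent saw $masked"

# Case 2: the child is killed by the default action of SIGTERM and
# never reaches exit. The parent shell reports a non-zero status for it.
killed=0
sh -c 'kill -TERM $$' || killed=$?
echo "child killed by SIGTERM, parent saw $killed"
```

The `|| var=$?` form is used only so the sketch keeps going after a non-zero status, even under `set -e`.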

And only a process that is running can invoke the exit() call. If a C program tries to run a nonexistent program or one without the proper execute bits set, the exec() system call itself will fail and the program will never run.

You are asking this question in the shell programming forum, though. A programmer who writes a shell sees the above interface. But a programmer who uses a shell is at the mercy of the shell's designer.

If you type "./perderabo" you will get an error message telling you that the shell can't find the file perderabo. But most shells will also set the exit code to some non-zero number. Nothing ran, the shell either noticed that the file wasn't there via stat() or it attempted the exec() which failed. But it will set $? (or $status) anyway. This is kinda useful I guess.
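A small sketch of that behavior, assuming a POSIX-style shell and the widely available (but not POSIX) `mktemp -d`. The file names `missing` and `noexec` are made up for the demonstration. POSIX shells report 127 when the command cannot be found and 126 when it is found but cannot be executed; other shells may pick different numbers.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$tmp/noexec"
chmod a-x "$tmp/noexec"     # make sure no execute bit is set

# Nothing runs in either case, yet the shell still sets $?.
missing_rc=0
"$tmp/missing" 2>/dev/null || missing_rc=$?

noexec_rc=0
"$tmp/noexec" 2>/dev/null || noexec_rc=$?

echo "missing file:    \$? = $missing_rc"
echo "no execute bit:  \$? = $noexec_rc"
rm -rf "$tmp"
```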

The New Kornshell Command and Programming Language by Morris Bolsky and David Korn says:

So exit codes used by the shell for processes killed by signals are not fixed even within different versions of ksh, let alone across all shells.
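Since the number itself cannot be relied on, one hedged approach is to let the shell translate it back: POSIX specifies that `kill -l` accepts an exit status and prints the matching signal name. A minimal sketch, assuming a POSIX-style shell. The variable names are illustrative, and the `-gt 128` test is only a heuristic, because a program may also exit normally with a large value.

```shell
#!/bin/sh
rc=0
sh -c 'kill -KILL $$' || rc=$?

if [ "$rc" -gt 128 ]; then
    # Ask the shell which signal this status corresponds to.
    signame=$(kill -l "$rc")
    echo "status $rc looks like death by SIG$signame"
else
    signame=
    echo "status $rc looks like a normal exit"
fi
```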

What I do is use "exit 0" for success and "exit n" for failure where n is a small integer. I rarely go above 10 and have never reached 20. I rely on shells to be able to determine the difference between a zero exit status and a non-zero exit status. But I almost never test for different non-zero values. I will display a non-zero exit code where it can be seen by a human who can (perhaps) use it to understand what is happening.
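The convention described above might look like this in practice. This is a minimal sketch; the functions `check_file` and `report` and the codes 1 and 2 are hypothetical, chosen only to show small, distinct failure values.

```shell
#!/bin/sh
# Hypothetical helper: 0 = success, small integers = distinct failures.
check_file() {
    [ -n "$1" ] || return 1     # no file name given
    [ -r "$1" ] || return 2     # file missing or unreadable
    return 0
}

# The caller only distinguishes zero from non-zero, and shows the
# number to a human instead of branching on it.
report() {
    if check_file "$1"; then
        echo "ok: $1"
    else
        echo "check_file failed with status $?: ${1:-<no name>}"
    fi
}

report /
report /no/such/file
report ""
```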

cdin2 · March 11, 2002, 9:03pm

Thanks for enlightening me. I guess I know the mechanism of how the exit code is constructed. Just like you said, shell scripting is somewhat different. I believe certain exit codes should be avoided. For example,

chgrp mygroup b.file
if [ -z "$?" ]; then
   exit 1
fi

It worked until someone accidentally changed the owner of b.file. Suddenly, it returns "1". But that "1" does not come from the if statement. It actually comes from the shell after chgrp fails.

My thinking is maybe there is a range of exit codes that we should avoid. Just a guess.
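On the snippet above: `$?` always expands to a number, so a `-z` (empty string) test on it never succeeds and the `exit 1` branch is never taken. A hedged sketch of one alternative follows: test for failure directly, print the command's own status for a human, and return a code of your own. The wrapper name `change_group` and the value 10 are hypothetical. Keeping your own codes small, as suggested earlier in the thread, also stays clear of 126, 127 and the values above 128 that POSIX shells use themselves.

```shell
#!/bin/sh
# Hypothetical wrapper around chgrp with its own failure code.
change_group() {
    chgrp "$1" "$2" 2>/dev/null && return 0
    cmd_rc=$?                   # status of the failed chgrp
    echo "chgrp $1 $2 failed with status $cmd_rc" >&2
    return 10                   # our own code, distinct from chgrp's
}
```

A script would call it as `change_group mygroup b.file || exit $?`, so a caller that sees 10 knows it came from the script's own check and not from chgrp or the shell.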

Thanks for the help again.