The Whole Story on #! /usr/bin/ksh

Introduction

Originally, we only had one shell on Unix. When you ran a command, the shell would attempt to invoke one of the exec() system calls on it. If the command was an executable, the exec would succeed and the command would run. If the exec() failed, the shell would not give up; instead it would try to interpret the command file as if it were a shell script. This works fine as long as there is only one shell on the system. But what if you are using one shell as your interactive shell and want to run a script written in another shell's language?

This is where the #! trick comes in. The idea of using # to represent a comment originated with csh and was quickly added to the Bourne shell. Now all shells know to ignore stuff after a #. So we can add a leading line something like "#! /usr/bin/ksh". To the shell this line is just a comment. But if the kernel tries to execute a file with this line, it will exec the specified interpreter and pass the script to it. So shell scripts become executable pretty much the way real executables are. Now when a shell tries to exec a shell script, it succeeds.

What if you leave off the #! line? Well, the kernel exec will fail. Your average shell will then try to run the script itself. A few shells will try to inspect the script to try to guess the language. This is not good. You may be running ksh as your interactive shell and writing ksh scripts. If you later switch to bash as an interactive shell, some of your scripts may continue to run while others may fail. Also the line is a comment that provides an important clue to a programmer who looks at the script to understand it. Knowing which language the author is attempting to use is a big help.
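To check whether a file carries the line at all, looking at its first two bytes is enough. Here is a minimal sketch; has_shebang is a name made up for illustration, not a standard command.

```sh
#! /bin/sh
# has_shebang is a hypothetical helper name, not a standard command.
# It succeeds only if the first two bytes of the file are "#!".
has_shebang() {
    # dd copies a single 2-byte block; its statistics on stderr are discarded.
    magic=$(dd if="$1" bs=2 count=1 2>/dev/null)
    [ "$magic" = "#!" ]
}
```

A loop such as "for f in *; do has_shebang "$f" || echo "$f"; done" would then list the files in a directory that rely on the shell's fallback behavior.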

While I have used the term "shell", actually this technique can be used with many programs that are not shells. Here is a "script" to display a multiline message:

#! /usr/bin/cat
Line 1
Line 2 
Line 3

This will display that "#! /usr/bin/cat" line, but other than that, it works fairly well.

Passing an Argument

You can pass a single argument like this:
#! /usr/local/bin/perl -w
But, in general, you are limited to one argument. On most systems, a line like: "#! /some/interpreter -a -b" will result in "-a -b" being passed as a single argument. However, the single argument is not limited to starting with a hyphen. We can improve on our message script:

#! /usr/bin/sed 1d
Line 1
Line 2 
Line 3
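The one-argument rule can be sketched in shell. This is only an approximation of what most kernels are described as doing above, not any kernel's actual code: parse_shebang is a hypothetical name, only spaces are treated as separators, and trailing white space is kept as part of the argument.

```sh
#! /bin/sh
# parse_shebang is a hypothetical name. It splits a #! line into the
# interpreter path and, if present, ONE argument holding everything else.
parse_shebang() {
    rest=${1#??}                       # drop the leading "#!"
    rest=${rest#"${rest%%[! ]*}"}      # drop spaces after "#!"
    interp=${rest%% *}                 # the path runs up to the first space
    arg=${rest#"$interp"}              # whatever follows the path
    arg=${arg#"${arg%%[! ]*}"}         # drop spaces before the argument
    printf 'interpreter: %s\n' "$interp"
    [ -n "$arg" ] && printf 'argument: %s\n' "$arg"
    return 0
}
```

Calling parse_shebang '#! /some/interpreter -a -b' prints "interpreter: /some/interpreter" and then "argument: -a -b", with the two switches kept together as a single string.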

Example

Let's put all of this together with an example. Here is a perl script that I will call perlargs:

#! /usr/local/bin/perl -w
print "script name is ", $0, "\n";
while (@ARGV) {
        $ARGV = shift @ARGV;
        print "argument ", $i++,  " is ", $ARGV, "\n";
}

system "ps -f -ww";

The -w asks perl to issue warning messages. The script simply displays its arguments, then runs the ps command. When I run it, I get:

$ ./perlargs one two three
Name "main::i" used only once: possible typo at ./perlargs line 10.
script name is ./perlargs
argument 0 is one
argument 1 is two
argument 2 is three
                 UID        PID   PPID    STIME TTY     TIME CMD
                 perderabo   69      1 17:47:28 n01  0:00.24 /bin/ksh -l
                 perderabo  201     69 18:28:22 n01  0:00.03 /usr/local/bin/perl -w ./perlargs one two three
                 perderabo 2055    201 18:28:22 n01  0:00.01 ps -f -ww
$

Notice that the perl process was called with 5 arguments. The 2nd argument is the name of the script. It is up to the perl process to present the final 3 arguments as the argument list seen by the script. Also realize the kernel started the perl process. After that, it is up to perl to open the script, read it, and execute each line. This is why scripts need to be readable. You cannot execute a non-readable script.

Conclusion

This should be enough information to understand what is happening with those #! lines. In the following posts, I will add details on various aspects of the process.


The Format of the #! Line

We can say for certain that the first 2 characters must be "#!". Or can we? Many systems, it seems, are willing to delete leading white space. My recommendation is to start with #!. It's traditional.

Next there may be an optional space. Some documentation says this space is required, but as far as anyone can determine the only Unix release to require the space was a snapshot release of BSD 4.1 (this was not a general release). Actually, it appears that you may have several spaces if you want. And some testing with TAB characters has been done and seems to work. My recommendation is to stay with zero or one spaces.

Next comes the full path to the interpreter and like all full paths, it must start with a /. Oops, another exception... The Linux kernel (at least version 2.0.34) is willing to accept a relative path. My recommendation is don't do that.

We may be done. Or we may have optional white space which leads to our single argument. Except that some versions of FreeBSD handle multiple arguments.

Most versions of BSD and HP-UX will strip trailing white space. Other versions of Unix treat trailing white space as valid characters. And a few versions of BSD can accept a trailing comment delimited by a # character.

How long can the line be? A few versions of Unix set the limit as low as 32 characters. FreeBSD can apparently handle 8192 characters.

At least the line always ends with the Unix standard \n character, right? Well, not always. Some versions of Unix will tolerate a \r\n ending and strip off the \r while others won't do that.

This is not as standard as it could be...
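Given all that variation, a cautious script author might check a #! line against the most conservative reading of the points above. The following is a sketch only: lint_shebang is a hypothetical name, the 32 character limit is simply the lowest figure mentioned above, and only spaces (not TABs) are examined.

```sh
#! /bin/sh
# lint_shebang is a hypothetical name. It prints one message per concern
# and returns non-zero if it found any.
lint_shebang() {
    line=$1
    problems=0
    cr=$(printf '\r')
    rest=${line#??}                          # text after the first two characters
    interp_path=${rest#"${rest%%[! ]*}"}     # same text with leading spaces removed

    case $line in
        '#!'*) ;;
        *) echo 'does not start with #!'; problems=1 ;;
    esac
    case $rest in
        '  '*) echo 'more than one space after #!'; problems=1 ;;
    esac
    case $interp_path in
        /*) ;;
        *) echo 'interpreter is not a full path'; problems=1 ;;
    esac
    case $line in
        *' ') echo 'trailing white space'; problems=1 ;;
    esac
    case $line in
        *"$cr") echo 'ends with a carriage return'; problems=1 ;;
    esac
    if [ "${#line}" -gt 32 ]; then
        echo 'longer than 32 characters'; problems=1
    fi
    return "$problems"
}
```

A line such as "#! /usr/bin/ksh" passes silently. Passing the first line of a script edited on a DOS-style system would report the carriage return.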

Argument 0 of The Process, Not The Script

There is another way that implementations may differ. Consider the perl script that I ran at the end of part 1. My shell did the approximate equivalent of
execl("./perlargs", "./perlargs", "one", "two", "three", (char *) NULL)
and the kernel transformed it into the approximate equivalent of
execl("/usr/local/bin/perl", "/usr/local/bin/perl", "-w", "./perlargs", "one", "two", "three", (char *) NULL)

Argument zero is the second parameter in each call above, and by convention it is the same as the path of the program being executed. A notable exception is that the login program will set it to stuff like "-ksh". Originally, executable shell scripts had the argument 0 set to the name of the script rather than the name of the interpreter. These days, the name of the interpreter is common. The last hold-out I know of is HP-UX which sets argument 0 to the name of the script.
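The transformation can be imitated in shell to see what argument list an interpreter ends up with. This is a sketch under assumptions, not kernel code: shebang_argv is a hypothetical name, it follows the common convention of putting the interpreter's name in argument 0, and it treats everything after the interpreter path as one argument.

```sh
#! /bin/sh
# shebang_argv is a hypothetical name. Given a script and its arguments,
# it prints the argument list the interpreter would receive, one per line,
# starting with argument 0.
shebang_argv() {
    script=$1
    shift
    # Read the first line, tolerating a missing final newline.
    IFS= read -r line < "$script" || [ -n "$line" ] || return 1
    case $line in
        '#!'*) ;;
        *) return 1 ;;                 # no #! line: the kernel exec would fail
    esac
    rest=${line#??}
    rest=${rest#"${rest%%[! ]*}"}      # drop spaces after "#!"
    interp=${rest%% *}
    arg=${rest#"$interp"}
    arg=${arg#"${arg%%[! ]*}"}         # drop spaces before the single argument
    if [ -n "$arg" ]; then
        set -- "$interp" "$arg" "$script" "$@"
    else
        set -- "$interp" "$script" "$@"
    fi
    printf '%s\n' "$@"
}
```

Run against the perlargs script as "shebang_argv ./perlargs one two three", it prints /usr/local/bin/perl, -w, ./perlargs, one, two, and three on separate lines, matching the second execl call.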


When executable shell scripts were first introduced they honored the suid bit. This was a disaster for security. Consider a script called /usr/sbin/disaster. And we did:
chown root /usr/sbin/disaster
chmod 4755 /usr/sbin/disaster
Naturally, /, /usr, /usr/sbin are only writable by root. And the script itself is just:
#! /bin/sh
That's right. The script is a single line with no hidden characters. The shell will ignore it because it is a comment and then exit because there are no other lines. This script is already insecure and is vulnerable to two different attacks.

Attack 1: link to -i

Suppose we do:
ln -s /usr/sbin/disaster ./-i
and then arrange to execute our new symbolic link called "-i". If we execute the real /usr/sbin/disaster, we do the equivalent of:
execl("/bin/sh", "/usr/sbin/disaster", "/usr/sbin/disaster", (char *) NULL)
No problem there. That is what we expected. But when we run our special link called "-i" we do:
execl("/bin/sh", "-i", "-i", (char *) NULL)
This causes the suid shell to behave as an interactive shell! The shell was quickly modified to accept a single hyphen as the end of switch setting arguments. So we modify our one line script to read:
#! /bin/sh -
Now the same trick would result in:
execl("/bin/sh", "-i", "-", "-i", (char *) NULL)
Argument zero is still "-i", but this is harmless. It might even be considered useful because it makes this attack more noticeable with the ps command.

Attack 2: changing the link

This one is harder to explain. As before, we make a symbolic link to the script under attack, but this time the name does not matter. Then we run the script via our new symbolic link. Then we quickly change the symbolic link to point to an evil script. We want to change what the symbolic link points to after the kernel opens it but before the interpreter opens it. This is a race condition and it will not work every time. It also requires a bit of clever programming to pull this one off. But done correctly, we will have our evil script running as root.

Suid Scripts Disabled

Because it was not possible to write a secure suid shell script, the concept of suid shell scripts was removed from Unix. Around this time the program sudo was written and this largely obviated the need for suid shell scripts. I don't believe any version of Unix released in the past 15 years has these problems. (And if I did, I would not have discussed these attacks. ;)) By now I hope you can see why many old time system admins (such as myself) still have a dim view of suid shell scripts.

The Return of Suid Scripts

Solaris now supports suid shell scripts but it is immune to these attacks. It does this by ensuring that the script is opened only once in the case of a suid script. If the suid bit is set, Solaris uses the fd filesystem to pass the script to the interpreter. Had my perlargs script been suid on Solaris, it would have been run something like:
execl("/usr/local/bin/perl", "/usr/local/bin/perl", "-w", "/dev/fd/3", "one", "two", "three", (char *) NULL)
and when perl opens /dev/fd/3, it just gets another open file descriptor pointing to the same file as whatever has been opened as fd 3. Note that inside the script, there is no way to obtain the name other than /dev/fd/3.

Because the name used is concealed from the interpreter, there is no harm if that name was something odd like "-i". Because the script is opened one time, there is no harm if a symbolic link is suddenly switched to an evil script.
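The /dev/fd idea can be tried from an ordinary shell on systems that provide that filesystem. This is just an illustration of reading through an already open descriptor, not how Solaris implements suid scripts; read_via_fd is a hypothetical name and descriptor 3 is an arbitrary choice.

```sh
#! /bin/sh
# read_via_fd is a hypothetical name. The shell opens the file once on
# descriptor 3, and cat is told only the name /dev/fd/3.
# Assumes the system provides /dev/fd.
read_via_fd() {
    cat /dev/fd/3 3< "$1"
}
```

Here cat never sees the real file name, much as the interpreter in the execl call above sees only "/dev/fd/3".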

But note that this finally gets us to a stage where a one line script containing only a comment can be safely run. There can still be other security problems with a poorly written suid script. If you must use suid shell scripts, here are a few tips:

  1. Use ksh
    Dave Korn analysed all of the attacks on shell scripts and closed as many holes as he could. In particular, ksh is immune to IFS based attacks. Also if it finds that it is going interactive with an effective uid of root together with a real uid which is not root, it will use its root authority to set the effective uid to the real uid prior to issuing a prompt.

  2. Control PATH
    Explicitly set your PATH variable as the first step in your script. Make the list of directories as short as possible. Do not start or end the list with a colon or have two consecutive colons. And do not put . or .. in the list. Explicitly export PATH. And ensure that every directory mentioned in PATH is writable only by root. For example, if /usr/local/bin is on the PATH, then in addition to the obvious need for /usr/local/bin being non-writable, you also need to ensure that /, /usr, and /usr/local are all not writable. If /usr/local is a mounted filesystem, it must not be possible for a non-root user to arrange for /usr/local to be unmounted. You can further protect your script by relying on PATH as little as possible. So, for example, use "/usr/bin/rm" rather than just "rm".

Note: while I used /usr/local/bin as an example, I would strongly resist putting /usr/local/bin in the PATH of a suid script. Again: make the list of directories as short as possible. I would rarely go beyond "PATH=/usr/bin" in a suid script.

  3. Control IFS
    Set IFS to a space, a tab, and a newline. And then export IFS. While ksh is immune to the IFS attack, some other shells are not immune. Something you do in your script may indirectly invoke another shell.

  4. Make sure that the script is not writable
    Most versions of unix these days will remove the suid bit on a file as it is being written by a process owned by a user different than the owner of the file. But don't depend on that behavior.

  5. Do not execute directly or indirectly any user supplied input
    You must ensure that user input is never delivered to the shell to interpret as executable code. If you don't know how to ensure this, then you should not process any user input. Note that the parameters of the script must be considered user input.

  6. Be careful if you invoke programs that solicit input from the user
    Do not invoke programs that allow the user to execute arbitrary programs. For example, do not ask the user to vi a file. vi offers a way for the user to execute a subshell. Yes, ksh's built-in protection will protect you if the user invokes ksh interactively. But, unfortunately, other shells exist. And the user doesn't need an interactive shell with vi. A command like ":!rm /etc/passwd" will work from vi and this is not using an interactive shell.

These steps will go a long way toward securing any suid scripts you write, but I can't guarantee that they are enough to completely secure a script.
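The PATH and IFS tips can be sketched as a preamble. Treat this as a starting point under assumptions rather than a vetted recipe: harden_environment and path_is_safe are hypothetical names, the check covers only the colon and dot rules from the PATH tip (it does not test directory permissions), and "PATH=/usr/bin" is just the short list suggested above.

```sh
#! /bin/sh
# Both function names are hypothetical.

# path_is_safe rejects a PATH value with an empty entry (leading, trailing,
# or doubled colon) or with an entry that is exactly . or ..
path_is_safe() {
    case $1 in
        '' | :* | *: | *::*) return 1 ;;
    esac
    case ":$1:" in
        *:.:* | *:..:*) return 1 ;;
    esac
    return 0
}

# harden_environment sets and exports a short PATH and a default IFS.
harden_environment() {
    PATH=/usr/bin
    export PATH
    IFS=$(printf ' \t\nx')   # the trailing x keeps the newline from being stripped
    IFS=${IFS%x}             # IFS is now space, tab, newline
    export IFS
}
```

A script would call harden_environment as its first step. path_is_safe is meant for reviewing a longer list before it is adopted.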


Invented by Dennis Ritchie

The concept originated with Dennis Ritchie:

>From dmr Thu Jan 10 04:25:49 1980 remote from research
The system has been changed so that if a file being executed
begins with the magic characters #! , the rest of the line is understood
to be the name of an interpreter for the executed file.
Previously (and in fact still) the shell did much of this job;
it automatically executed itself on a text file with executable mode
when the text file's name was typed as a command.
Putting the facility into the system gives the following
benefits.

1) It makes shell scripts more like real executable files,
because they can be the subject of 'exec.'

2) If you do a 'ps' while such a command is running, its real
name appears instead of 'sh'.
Likewise, accounting is done on the basis of the real name.

3) Shell scripts can be set-user-ID.

4) It is simpler to have alternate shells available;
e.g. if you like the Berkeley csh there is no question about
which shell is to interpret a file.

5) It will allow other interpreters to fit in more smoothly.

To take advantage of this wonderful opportunity,
put

	#! /bin/sh

at the left margin of the first line of your shell scripts.
Blanks after ! are OK.  Use a complete pathname (no search is done).
At the moment the whole line is restricted to 16 characters but
this limit will be raised.


From uucp Thu Jan 10 01:37:49 1980

BSD picked it up from Version 8 of Research Unix and made it popular.

What to Call the Concept

Eventually, the #! line came to be called the "sharpbang" line and sharpbang can still be found in some kernel source code. This was shortened to "shebang" somehow. I would have expected "shabang" but it seems that "shebang" is what most people use. A few people seem to use "hashpling". Apparently, SCO has a switch called "hashplingenable" which must be set at kernel build time to enable this feature. And I have been told that some folks use the term "hashbang". (Anyone for sharppling?)

Special Rules for Perl

Recall that multiple arguments on the #! line are often passed as a single argument. If perl is passed an argument like "-a -b", it will break it apart and act like "-a" and "-b" had been passed. Also perl does not ignore the #! line. Perl will inspect it and will turn on any switches it finds. This helps compensate for systems with a very short #! line limit. Perl will scan a #! line until it finds the string "perl". Then it starts scanning for switches. During this scan, "-*" and "- " are ignored. If the #! line does not contain the string "perl", perl will process the #! line the way a Unix kernel would and thus invoke the proper interpreter.

Handling Different Paths

Let's say some systems have, for example, /opt/perl/bin/perl and /usr/local/bin/perl. How to write a perl script? My feeling is that it falls on the System Administrator to provide the needed commonality. I would put perl in /usr/local/bin. If the perl installation procedure puts perl in /opt/perl/bin/perl, that is fine... but then I would add a symbolic link in /usr/local/bin pointing to the perl executable. Same thing with python. So on all systems under my control "#! /usr/local/bin/python" is guaranteed to work or python is not available on that system. Same thing with perl, etc. It is rare for me to add a link to a directory like /usr/bin. But I will do this for ksh and bash should an OS be missing either shell. And I always ensure that bash and ksh will work if stuff like /opt or /usr/local is unmounted. These shells are the only cases where I will override the installation procedure if I must. So on my systems stuff like this will work:

#! /usr/bin/ksh
#! /usr/bin/bash
#! /usr/local/bin/perl

And until this stuff works, I do not consider the OS installation to be complete. Scripts imported from the outside world may need a tweak to run if they expect the interpreters elsewhere. But programs like "configure" will find perl if it is in /usr/local/bin. So this is my approach. But there are other ideas...

Another approach is to use a #! line that results in an intermediate program invoking the desired interpreter after a PATH search. The most frequently used example would be something like "#! /usr/bin/env perl". This assumes that env is in /usr/bin (not the case with, for example, Unicos, but my above comments apply to env and I would add the required links). It also assumes that perl is on the PATH somewhere. A more elaborate example is

#! /usr/bin/sh -- # -*- perl -*- -p
eval 'exec perl -S $0 ${1+"$@"}'

And even more elaborate examples are in the book Programming Perl by Larry Wall, et al.
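Since the env approach depends on a PATH search, it can help to see which file such a search would pick. The sketch below walks PATH by hand; find_on_path is a hypothetical name, and it only approximates the search that env performs.

```sh
#! /bin/sh
# find_on_path is a hypothetical name. It prints the first executable
# regular file called $1 found in the directories listed in PATH.
find_on_path() {
    name=$1
    saved_ifs=$IFS
    IFS=:
    set -f                              # no globbing while PATH is split
    for path_dir in $PATH; do
        [ -n "$path_dir" ] || path_dir=.    # an empty entry means the current directory
        if [ -f "$path_dir/$name" ] && [ -x "$path_dir/$name" ]; then
            IFS=$saved_ifs
            set +f
            printf '%s\n' "$path_dir/$name"
            return 0
        fi
    done
    IFS=$saved_ifs
    set +f
    return 1
}
```

Running "find_on_path perl" as different users, or from cron, shows how the interpreter chosen by "#! /usr/bin/env perl" can change with the caller's PATH.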

And the POSIX standard also weighs in on this question. See the linked page for their suggested installation script.

The ksh expanded environment

Let's say that you use ksh exclusively and you have a script and you leave off the "#!". Your interactive ksh will try to exec the script and fail. So your interactive ksh will fall back to running your script as a ksh script. It does this by forking a copy of itself to create a subshell. This forked subshell can then, in effect, expand on the standard Unix concept of "environment". When a process exec's another program, the new program inherits a bunch of stuff: open files, current directory, and the ENVIRONMENT. The ENVIRONMENT is a set of strings which, by convention, take the form of variable settings, like: PATH=/usr/bin:/usr/local/bin. But the forked ksh subshell knows everything about the parent shell. Early ksh versions exploited this by having ways to export arrays, aliases, and functions. This requires avoiding the #! line and thus living with the problems this creates. Most sites strongly encourage the use of the #! line and this undermines the ksh expanded environment concept. And it confused beginners who saw stuff like exported aliases mentioned in the docs. Recent versions of ksh have retreated from the expanded environment concept. So my suggestion is to forget about ksh exported arrays, aliases, and functions and use that #! line.
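The difference between the standard environment and what a forked subshell knows can be seen with a small experiment. This sketch uses plain sh rather than ksh, and the names DEMO_SETTING, demo_helper, and show_inheritance are made up for illustration. The child started with "sh -c" is a freshly exec'ed process, so it receives the exported variable but not the function.

```sh
#! /bin/sh
# All names here are hypothetical.
demo_helper() {
    echo 'defined in the parent shell'
}

show_inheritance() {
    DEMO_SETTING=hello
    export DEMO_SETTING
    # The child is exec'ed, so only the ENVIRONMENT strings reach it.
    sh -c '
        echo "variable: $DEMO_SETTING"
        if command -v demo_helper >/dev/null 2>&1; then
            echo "function: defined"
        else
            echo "function: not defined"
        fi
    '
}
```

Calling show_inheritance prints "variable: hello" followed by "function: not defined". A script run through a #! line is in the same position as that child.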
