Short answer: because in UNIX everything is a file.
Long version: picture a process as a garden hose. You pour water (data) into it at the top, inside something happens (the data is processed in some way), then the result comes out at the bottom.
Whatever comes out will land in a file called /dev/ttyX (or something similar, depending on your OS), which represents the terminal you are sitting at. Type something on your keyboard and a driver will move the typed characters into this file (from where some program - usually the shell - will pick them up). Have the shell generate some output and it will land there too, from where a driver picks it up and displays it on your screen. These two drivers basically constitute what is called a "terminal emulator".
To come back to the water hose picture: with redirection you can decide what to attach to each end of the hose. Consider the command:
# ls
We usually say "it displays a directory listing", but strictly speaking this is not the case: what it does is generate a data stream containing the directory information. We have a "hose" where nothing goes in and a stream of characters (the directory listing) comes out. By default the output of every process started from the shell is directed to /dev/tty, and this is why the generated data is displayed on your screen. But if we want it somewhere else, we can redirect it:
# ls > /some/file
Now we have attached a "different bucket" to the end of the water hose and the data now lands in it. In the same way we can redirect the input of a process. Because ls does not want or need any input, we use grep for that:
# grep "word" >/some/output </some/input
We have attached the file "/some/input" to the opening of the hose, so the data in this file runs into it. Inside, "grep" does its work (it keeps the lines containing "word"; all others are dropped) and the result goes into another file attached to the bottom of the hose: "/some/output".
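To try this without touching real files, here is a small sketch. The temporary directory (created with mktemp -d, which is assumed to be available on your system) and the file names "input" and "output" are made up for the illustration:

```shell
# Work in a throwaway directory so nothing outside it is touched.
dir=$(mktemp -d)

# A small input file: two of its three lines contain "word".
printf 'a word here\nnothing to see\nanother word\n' > "$dir/input"

# grep reads from "$dir/input" and writes the matching lines to "$dir/output".
grep "word" > "$dir/output" < "$dir/input"

# Show what landed in the bucket.
cat "$dir/output"
```

Only the two lines containing "word" should end up in the output file.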
Now this is all fine but how about connecting a hose not to a bucket but another hose? We can do that too: this is called a "pipeline" and the symbol is "|". Let us have a look:
# ls | grep "myfile"
We have the first process, ls, which has no input but some output. This output is directly connected to the input of another process, grep, which further processes what ls emits. The result lands on the screen because of the default redirection I described above, but we could redirect it further to a file or - by another pipeline - to another process.
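A self-contained way to play with this might look as follows. The directory and the file names in it are invented for the example, and mktemp is assumed to exist on your system:

```shell
# A scratch directory with three files; a separate temp file takes the result.
dir=$(mktemp -d)
out=$(mktemp)
touch "$dir/myfile.txt" "$dir/otherfile.txt" "$dir/notes"

# ls writes its listing into the pipe; grep reads it from there.
ls "$dir" | grep "myfile"

# The end of the pipeline can itself be redirected into a file.
ls "$dir" | grep "file" > "$out"
cat "$out"
```

The redirection at the end applies only to grep, the last hose in the chain. ls still writes into the pipe.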
It is even possible to create a filesystem representation of such a pipe: it looks like a file, but in fact it is just a name under which the output of one command is buffered until another command picks it up and processes it. This is called a "named pipe" and the command to create one is mkfifo.
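A minimal sketch of a named pipe in action, assuming mktemp is available; the name "pipe" and the sample data are made up for the illustration:

```shell
dir=$(mktemp -d)
mkfifo "$dir/pipe"

# Writer: runs in the background, because opening a named pipe for
# writing blocks until some process opens it for reading.
printf 'one\ntwo\nthree\n' > "$dir/pipe" &

# Reader: grep picks up what the writer sent through the named pipe.
result=$(grep "t" < "$dir/pipe")
wait

echo "$result"
rm -r "$dir"
```

Writer and reader know nothing about each other. All they share is the name of the pipe in the filesystem.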
Finally I want to confuse you hopelessly: the water hoses (processes) in UNIX are weird because they have, by default, not two openings but three: stdin, stdout and stderr. Think of the hoses as Y-shaped, with two outlets, not one.
UNIX processes use stdin for input. By default this is the keyboard, as you can see when typing the command:
# cat > /some/file
You will notice that this seems to "hang". In fact it does not hang but waits for input - your keystrokes. Type something and you will see that. Finally press CTRL-D, which signals "END-OF-FILE" to the reading program, and you will find all you have typed in the file named "/some/file", which is where we redirected stdout to. Had we not redirected it, it would have landed on the screen - again, the default redirection.
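If you want to see the same thing without typing, you can let a pipe play the role of the keyboard. This is only a sketch; the temporary file stands in for "/some/file" and mktemp is assumed to be available:

```shell
file=$(mktemp)

# The pipe takes the place of the keyboard: cat reads its stdin until the
# writer is done, which is the end-of-file that CTRL-D signals at a terminal.
printf 'first line\nsecond line\n' | cat > "$file"

# Everything that came in via stdin has landed in the file.
cat "$file"
```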
This leaves stderr to be explained: this is where commands write their diagnostic messages. It can be redirected the same way you can redirect stdout. Some people are confused because by default both are directed to the screen, so that both types of output land there. But if two different hoses deliver into the same bucket it doesn't mean they are identical! So let us try to separate them visibly:
# ls -l /etc/hosts /bla/foo/bar
Because "/etc/hosts" is a file that exists on practically every UNIX system, and chances are "/bla/foo/bar" does not exist on yours, you will get output like this:
# ls -l /etc/hosts /bla/foo/bar
ls: cannot access '/bla/foo/bar': No such file or directory
-rw-r--r-- 1 root root 220 Jul 23 17:34 /etc/hosts
Notice that the first line has come via stderr, the second via stdout. Now we will redirect the various parts away (to /dev/null, a file which devours everything sent to it - the trash can):
# ls -l /etc/hosts /bla/foo/bar >/dev/null
ls: cannot access '/bla/foo/bar': No such file or directory
We have redirected stdout, so that will not land on the screen any more.
# ls -l /etc/hosts /bla/foo/bar 2>/dev/null
-rw-r--r-- 1 root root 220 Jul 23 17:34 /etc/hosts
Here we have redirected stderr. Notice that stdin, stdout and stderr are so-called "I/O-descriptors" and also numbered: 0 is stdin, 1 is stdout and 2 is stderr. This is why "2>" redirects stderr, the I/O-descriptor 2. We have left out 1 for our redirections up to now because it is the default but to be ultra-super-duper-correct we would write:
# ls -l /etc/hosts /bla/foo/bar 1>/dev/null
ls: cannot access '/bla/foo/bar': No such file or directory
to redirect stdout.
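To see both outlets of the Y-shaped hose at once, each can be sent into its own bucket. A small sketch, with made-up file names in a temporary directory (mktemp assumed to be available):

```shell
dir=$(mktemp -d)
touch "$dir/exists"

# Descriptor 1 (stdout) goes to one file, descriptor 2 (stderr) to another.
# ls reports failure because one of the two files is missing, so we keep
# its exit status instead of letting it stop a script.
status=0
ls "$dir/exists" "$dir/missing" 1> "$dir/out" 2> "$dir/err" || status=$?

echo "stdout got: $(cat "$dir/out")"
echo "stderr got: $(cat "$dir/err")"
echo "exit status: $status"
```

The listing of the existing file should show up in "out", the complaint about the missing one in "err". The exact wording of the error message differs between systems.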
I am already at the end, just one more thing: it would be nice to have the possibility to redirect an I/O-descriptor to where another is already redirected. There is such a device:
# ls -l /etc/hosts /bla/foo/bar >/some/file 2>&1
The first redirection sends all of stdout to "/some/file". The second redirects stderr to wherever stdout is already pointing, in this case the same file. Notice, though, that order matters! All these redirections are interpreted from left to right! So this:
# ls -l /etc/hosts /bla/foo/bar 2>&1 >/some/file
will not do the same as the command above, because first stderr is redirected to where stdout currently points (the screen) and only then is stdout redirected to "/some/file", which does not affect stderr at all.
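The difference between the two orders can be made visible with a sketch like this one. The file names are invented, mktemp is assumed to be available, and a command substitution plays the role of the screen so that we can inspect what would have landed there:

```shell
dir=$(mktemp -d)
touch "$dir/exists"

# Order 1: stdout goes to the file first, then stderr follows it there.
ls "$dir/exists" "$dir/missing" > "$dir/both" 2>&1 || true

# Order 2: stderr is pointed at the current stdout first (here the command
# substitution instead of the screen), then stdout alone moves to the file.
leaked=$(ls "$dir/exists" "$dir/missing" 2>&1 > "$dir/only_out") || true

echo "file from order 1:"; cat "$dir/both"
echo "file from order 2:"; cat "$dir/only_out"
echo "escaped in order 2: $leaked"
```

With the first order the file holds both the listing and the error message. With the second it holds only the listing, while the error message escapes to wherever stdout pointed before.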
I hope this helps.
bakunin