Next: I/O Redirection Operators Up: Beginning UNIX User's Guide Previous: Other Properties

I/O Redirection

 

The UNIX family of operating systems and the C programming language have a great deal in common. One set of parallel concepts is the idea of a default source of input and default sinks for output and errors.

For instance, in C:

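 #include <stdio.h>
 #include <stdlib.h>

 int main( int argc, char *argv[] ) {
   int c;

   /* getchar() always reads from stdin. The extra parentheses make sure
      c receives the character itself, not the result of the comparison. */
   while( ( c = getchar() ) != EOF ) {
     if( c > 255 ) {
       fprintf( stderr, "Error, strange char code %d\n", c );
       exit( 1 );
     }
     putchar( c );   /* putchar() always writes to stdout */
   }
   exit( 0 );
 }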
This program simply copies the stream of bytes (characters) from stdin to stdout.

A similar (in principle) program in Perl, which reads a line at a time rather than a character at a time, is:

 #!/bin/perl -w
 my( $c ) ;

 # Read stdin one line at a time and copy each line to stdout.
 while( $c = <STDIN> ) {
   print STDOUT "$c";
 }

The shell can make use of these "standard" input/output streams in a very powerful way. Many programs are written so that, unless they are explicitly given the name of a file to open for input, they read from the stdin filehandle. Likewise, output goes to stdout by default (and error messages go to a separate stderr filehandle).

The shell can massage filehandles in the following way. If a single left angle bracket (<) is present in a command line, the shell assumes the next word on the command line is the name of a file to open. The shell opens the file for reading and attaches this stream of bytes to the input of the program. As far as the program is concerned, it is reading from the keyboard. One place this comes in handy is with compressed (gzip) files. Gzip "knows" that compressed files have an extension of ".gz". If your file doesn't end in .gz, gzip will happily ignore your attempts to decompress the file. If, instead of asking gzip to uncompress the file, you ask gzip to uncompress a stream, it will. For example, say we have compressed_file without the magic extension:

  gzip -d compressed_file
gzip: compressed_file: unknown suffix -- ignored

  gzip -d <compressed_file >uncompressed_file
The first command fails because gzip examines the name of the file before trying to open it. The second runs successfully, putting the uncompressed output in the file uncompressed_file.
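The same point, that a program reading through < never learns the file's name, can be seen with wc. This is only a small sketch: the scratch directory and the file name notes.txt are made up for illustration, and mktemp -d is assumed to be available (it is on most current systems, but it is not part of the original guide).

```shell
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$dir/notes.txt"   # hypothetical sample file

# Given a file name, wc opens the file itself and reports the name.
with_name=$(wc -l "$dir/notes.txt")

# With <, the shell opens the file; wc only sees a nameless stdin stream.
from_stdin=$(wc -l < "$dir/notes.txt")

echo "$with_name"
echo "$from_stdin"
rm -r "$dir"
```

The first line printed ends with the file name, the second is just the count, because in the second case wc was never told a name at all.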

The above example also demonstrates the redirection of the standard output (stdout) by the > operator. Some shells may refuse to carry out a > redirection when the target file already exists. This behaviour is controlled by a shell setting called noclobber (a variable in csh-style shells, an option in sh-style shells), and it is there to stop you from accidentally overwriting files. Without noclobber set, if the file uncompressed_file in the above example already existed, it would be overwritten.
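A minimal sketch of noclobber in an sh-style shell follows. It assumes the set -o noclobber spelling and the >| override that sh-style shells provide (csh-style shells use different syntax), and the file name out.txt is made up for the example.

```shell
dir=$(mktemp -d)
echo "first" > "$dir/out.txt"

# A subshell keeps the option from leaking into the rest of the session.
if ( set -o noclobber; echo "second" > "$dir/out.txt" ); then
  result="overwritten"
else
  result="refused"      # the shell printed an error and left the file alone
fi
kept=$(cat "$dir/out.txt")
echo "$result: file holds '$kept'"

# >| asks sh-style shells to overwrite even when noclobber is set.
( set -o noclobber; echo "third" >| "$dir/out.txt" )
forced=$(cat "$dir/out.txt")
echo "forced: file holds '$forced'"
rm -r "$dir"
```

The exact wording of the error message differs from shell to shell, so the sketch only looks at whether the redirection succeeded.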

Sometimes we want to append to a file instead of writing from the beginning. To do this we use two right angle brackets (>>).
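The difference between > and >> can be sketched like this (the file name log.txt is hypothetical, and noclobber is assumed not to be set):

```shell
dir=$(mktemp -d)
log="$dir/log.txt"          # hypothetical file name

echo "run 1" >  "$log"      # > starts the file from scratch
echo "run 2" >> "$log"      # >> adds to the end
echo "run 3" >> "$log"
appended=$(cat "$log")

echo "run 4" >  "$log"      # > again: the earlier lines are gone
truncated=$(cat "$log")

echo "$appended"
echo "$truncated"
rm -r "$dir"
```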

Computer people are kind of strange (if you hadn't noticed by now). Another thing they do strangely is count: they start at 0, not 1! Now what does this have to do with I/O redirection? Well, filehandle 0 is stdin, filehandle 1 is stdout and filehandle 2 is stderr. If > is used as written, the shell "assumes" that what is wanted is the redirection of stdout (filehandle 1), that is, 1>. If what you want is to redirect stderr to a file, you would use 2>. It is possible to send both stdout and stderr to the same file by redirecting stdout first and then adding 2>&1, as in command >file 2>&1.
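Here is a small sketch of the numbered redirections in an sh-style shell. The function both and all the file names are invented for the example; both simply stands in for any command that writes to the two output streams.

```shell
dir=$(mktemp -d)

# A hypothetical command that writes one line to each output stream.
both() {
  echo "normal output"
  echo "error message" >&2
}

both > "$dir/out.txt" 2> "$dir/err.txt"   # stdout and stderr to separate files
both > "$dir/all.txt" 2>&1                # stderr follows stdout into one file

out=$(cat "$dir/out.txt")
err=$(cat "$dir/err.txt")
all=$(cat "$dir/all.txt")
printf 'out.txt: %s\nerr.txt: %s\nall.txt:\n%s\n' "$out" "$err" "$all"
rm -r "$dir"
```

The order matters in the last form: 2>&1 means "send filehandle 2 wherever filehandle 1 is going right now", so it has to come after stdout has been pointed at the file.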

While it is not used much in interactive shells (but is in shell scripts), there is one more redirection operator, <<, which introduces what is known as a "here document". I won't do anything other than mention it here, as this is a fairly advanced thing.

So what can we do with this property of having "standard" filehandles? The most visible thing is what was pointed out above: the shell can open files for read, write or append for us, and direct the input or output to those files as indicated by the presence of the < or > operator. But the shell can do much more for us: it can allow us to "pipe" (|) the output of one program into the input of another program. Under non-multi-tasking operating systems, the only way to do this is to let the first program run to completion, storing its output in a temporary file, and then start the second program with its input connected to this temporary file. Once the second program has finished running, the temporary file is supposed to be deleted.
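The two approaches can be put side by side in a short sketch. The data and the file names fruit.txt and sorted.tmp are made up; sort and uniq are just convenient stand-ins for "first program" and "second program".

```shell
dir=$(mktemp -d)
printf 'pear\napple\npear\nfig\napple\n' > "$dir/fruit.txt"   # made-up data

# The temporary-file way: run sort to completion, then start uniq on its output.
sort "$dir/fruit.txt" > "$dir/sorted.tmp"
via_tempfile=$(uniq < "$dir/sorted.tmp")
rm "$dir/sorted.tmp"

# The pipe way: the shell connects sort's stdout directly to uniq's stdin.
via_pipe=$(sort < "$dir/fruit.txt" | uniq)

echo "$via_pipe"
rm -r "$dir"
```

Both give the same answer; the pipe version just has no intermediate file to create, name, and remember to delete.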

Under UNIX, this doesn't happen. The pipe is a fairly small storage space in the operating system (typically 4096 bytes, I believe, though the size varies between systems). The operating system starts and stops the programs feeding and reading the pipe, so that the program writing to the pipe doesn't overfill it and the program reading from the pipe waits only when there is nothing yet to read.
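One way to convince yourself that both ends of a pipe really run at the same time is to feed it from a program that never finishes on its own. The sketch below assumes the yes utility is installed (it is common, but not guaranteed everywhere); yes prints the letter y forever.

```shell
# If the shell had to wait for yes to run to completion before starting
# head, this line would never return. Instead head reads three lines and
# exits, and yes is stopped the next time it tries to write to the pipe.
first_three=$(yes | head -n 3)

echo "$first_three"
```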

The above scenario, where the shell binds the output of one program to the input of another program, has been extended to the idea of named pipes (for communication between two programs on a single computer) and sockets (for communication between two programs over TCP/IP).
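A named pipe can be tried out from the shell with the mkfifo command. This is a rough sketch only: the pipe name channel is hypothetical, and the writer and the reader would normally be two unrelated programs rather than two lines of one script.

```shell
dir=$(mktemp -d)
mkfifo "$dir/channel"                      # hypothetical pipe name

# The writer blocks until some reader opens the pipe, so run it in the background.
echo "hello through a named pipe" > "$dir/channel" &

# The reader is a separate command that simply opens the pipe by name.
read -r message < "$dir/channel"
wait                                       # let the background writer finish

echo "$message"
rm -r "$dir"
```

Unlike the | form, neither side was started by the other or by a common pipeline; all they share is the name of the pipe in the file system.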

One utility program which comes from the idea of connecting programs with pipes is tee. This program duplicates a stream, directing one copy to stdout and another copy to a file.
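A short sketch of tee sitting in the middle of a pipeline (the file name copy.txt and the sample lines are made up):

```shell
dir=$(mktemp -d)

# tee copies its stdin to the named file and also passes it along on stdout,
# so the next program in the pipeline still sees every line.
count=$(printf 'alpha\nbeta\ngamma\n' | tee "$dir/copy.txt" | wc -l)
saved=$(cat "$dir/copy.txt")

echo "lines seen downstream: $count"
echo "$saved"
rm -r "$dir"
```

This is handy when you want to keep a record of what went through a pipeline without interrupting it.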





Gordon Haverland
Sat Oct 9 13:50:48 MDT 1999