Contents:
One Bug Can Ruin Your Whole Day...
Tips on Avoiding Security-related Bugs
Tips on Writing Network Programs
Tips on Writing SUID/SGID Programs
Tips on Using Passwords
Tips on Generating Random Numbers
UNIX Pseudo-Random Functions
Picking a Random Seed
A Good Random Seed Generator
With a few minor exceptions, the underlying security model of the UNIX operating system - a privileged kernel, user processes, and the superuser who can perform any system management function - is fundamentally workable. But if that is the case, then why has UNIX had so many security problems in recent years? The answer is simple: although the UNIX security model is basically sound, programmers are careless. Most security flaws in UNIX arise from bugs and design errors in programs that run as root or with other privileges, as a result of configuration errors, or through the unanticipated interactions between such programs.
The disadvantage of the UNIX security model is that it makes a tremendous investment in the infallibility of the superuser and in the software that runs with the privileges of the superuser. If the superuser account is compromised, then the system is left wide open. Hence our many admonitions in this book to protect the superuser account, and to restrict the number of people who must know the password.
Unfortunately, even if you prevent users from logging into the superuser account, many UNIX programs need to run with superuser privileges. These programs run as SUID root, are started with superuser privileges when the system boots, or operate as network servers. A single bug in any of these complicated programs can compromise the safety of your entire system. This characteristic is probably a design flaw, but it is basic to the design of UNIX, and is not likely to change.
One of the best-known examples of such a flaw was a single line of code in the program /etc/fingerd, the finger server, exploited in 1988 by Robert T. Morris's Internet Worm. fingerd provides finger service over the network. One of the first things the program does is read a single line of text from standard input, containing the name of the user who is to be "fingered."
The original fingerd program contained these lines of code:
char line[512];
line[0] = '\0';
gets(line);
Because the gets() function does not check the length of the line read, a rogue program could supply more than 512 bytes of data, overrunning the stack frame of the fingerd server. Morris[1] wrote code that caused fingerd to execute a shell; because fingerd was usually installed to run as the superuser, the rogue program inherited virtually unrestricted access to the server computer. (fingerd didn't really need to run as superuser - that was simply the default configuration.)
[1] Or someone else. As noted in Spafford's original analysis of the code (see Appendix D, Paper Sources), there is some indication that Morris did not write this portion of the Worm program.
The fix for the finger program was simple: replace the gets() function with the fgets() function, which does not allow its input buffer length to be exceeded:
fgets(line,sizeof(line),stdin);
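For illustration, here is a minimal sketch of how such a request line might be read safely. This is our own example, not the actual fingerd source, and the function name read_request is hypothetical:

#include <stdio.h>
#include <string.h>

/* Read one request line into a caller-supplied buffer.
 * fgets() stops after len-1 characters, so the buffer
 * cannot be overrun no matter what the client sends. */
int read_request(char *line, int len)
{
    if (fgets(line, len, stdin) == NULL)
        return -1;                       /* end-of-file or read error */
    line[strcspn(line, "\r\n")] = '\0';  /* strip the trailing newline */
    return 0;
}

A caller declares char line[512]; and calls read_request(line, sizeof(line)); anything a client sends beyond 511 bytes is simply truncated rather than written past the end of the buffer.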
Fortunately, the Morris version did not explicitly damage programs or data on computers that it penetrated.[2] Nevertheless, it illustrated the fact that any network service program can potentially compromise the system. Furthermore, the flaw went unnoticed in the finger code for more than six years, from the time of the first Berkeley UNIX network software release until the day that the Worm ran loose. Remember this lesson: the fact that a hole has never been discovered in a program does not mean that no hole exists.
[2] However, as the worm did run with privileges of the superuser, it could have altered the compromised system in any number of ways.
The same example also illustrates the fallible human component. Shortly after the problem with the gets() subroutine was exposed, the Berkeley group went through all of its code and eliminated every similar use of the gets() call in a network server. Most vendors did the same with their code. Several people, including one of us, publicly warned that uses of other library calls that write to buffers without bounds checking also needed to be examined. These included calls to the sprintf() routine, and byte-copy routines such as strcpy().
In late 1995, as we were finishing the second edition of this book, a new security vulnerability in several versions of UNIX was widely publicized. It was based on buffer overruns in the syslog library routine. An attacker could carefully craft an argument to a network daemon such that, when an attempt was made to log it using syslog, the message overran the buffer and compromised the system in a manner hauntingly similar to the fingerd problem. After seven years, a cousin to the fingerd bug was discovered. What underlying library calls contribute to the problem? The sprintf() library call does, and so do byte-copy routines such as strcpy().
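The safer idioms are easy to sketch. The fragment below is our own illustration, not code from any vendor; note that snprintf() was not yet universally available on UNIX systems of this era, and that strncpy() does not always NUL-terminate its destination:

#include <stdio.h>
#include <string.h>

void copy_name(const char *user)
{
    char buf[64];

    /* Unbounded, unsafe versions - a long 'user' overruns buf:
     *     sprintf(buf, "user: %s", user);
     *     strcpy(buf, user);
     */

    /* Bounded alternatives: */
    snprintf(buf, sizeof(buf), "user: %s", user);   /* truncates to fit */

    strncpy(buf, user, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';   /* strncpy() may leave buf unterminated */
}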
The state of programming tools and methods is regrettable and leads to many UNIX security bugs, but the failure to learn from old mistakes is even more regrettable.
In December 1990, the Communications of the ACM published an article by Miller, Fredrickson, and So, entitled "An Empirical Study of the Reliability of UNIX Utilities" (Volume 33, issue 12, pp. 32-44). The paper started almost as a joke: a researcher was logged into a UNIX computer from home, and the programs he was running kept crashing because of line noise from a poor modem connection. Eventually Barton Miller, a professor at the University of Wisconsin, decided to subject the UNIX utility programs from a variety of different vendors to a selection of random inputs and monitor the results.
The results were discouraging. Between 25% and 33% of the UNIX utilities could be crashed or hung by supplying them with unexpected inputs - sometimes input as simple as an end-of-file in the middle of an input line. On at least one occasion, crashing a program tickled an operating system bug and caused the entire computer to crash. Many times, programs would freeze for no apparent reason.
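The technique itself is simple enough to sketch. The toy driver below is our own illustration, not the Wisconsin Fuzz tool; it pipes a stream of random bytes into a command's standard input and reports the raw wait status returned by pclose():

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Toy fuzz driver: feed random bytes to a command's standard input.
 * If the target crashes and closes the pipe, this driver itself is
 * killed by SIGPIPE - a crude but telling sign of a crash. */
int main(int argc, char *argv[])
{
    FILE *p;
    long i;

    if (argc != 2) {
        fprintf(stderr, "usage: %s command\n", argv[0]);
        return 1;
    }
    srand((unsigned) time(NULL));

    if ((p = popen(argv[1], "w")) == NULL) {
        perror("popen");
        return 1;
    }
    for (i = 0; i < 100000; i++)
        putc(rand() & 0xff, p);

    printf("%s: wait status %d\n", argv[1], pclose(p));
    return 0;
}

Running such a driver repeatedly against a filter or parser quickly turns up inputs that crash or hang it - and a crash on random input is a strong hint that a carefully chosen input could do something worse.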
In 1995 a new team headed by Miller repeated the experiment, this time running a program called Fuzz on nine different UNIX platforms. The team also tested UNIX network servers, and a variety of X Windows applications (both clients and servers).[3] Here are some of the highlights:
[3] You can download a complete copy of the papers from ftp://grilled.cs.wisc.edu/technical_papers/fuzz-revisited.ps.Z.
According to the 1995 paper, vendors were still shipping a distressingly buggy set of programs: "...the failure rate of utilities on the commercial versions of UNIX that we tested (from Sun, IBM, SGI, DEC, and NeXT) ranged from 15-43%."
UNIX vendors don't seem to be overly concerned about bugs in their programs: "Many of the bugs discovered (approximately 40%) and reported in 1990 are still present in their exact form in 1995. The 1990 study was widely published in at least two languages. The code was made freely available via anonymous FTP. The exact random data streams used in our testing were made freely available via FTP. The identification of failures that we found were also made freely available via FTP; these include code fragments with file and line numbers for the errant code. According to our records, over 2000 copies of the...tools and bug identifications were fetched from our FTP sites...It is difficult to understand why a vendor would not partake of a free and easy source of reliability improvements."
The two lowest failure rates in the study belonged to the Free Software Foundation's GNU utilities (a failure rate of 7%) and the utilities included with the freely distributed Linux version of the UNIX operating system (a failure rate of 9%).[4] Interestingly enough, the Free Software Foundation has strict coding rules that forbid the use of fixed-length buffers. (Miller et al. failed to note that many of the Linux utilities were repackaged GNU utilities.)
[4] We don't believe that 7% is an acceptable failure rate, either.
There were a few bright points in the 1995 paper. Most notable was the fact that Miller et al. were unable to crash any UNIX network server. The group was also unable to crash any X Windows server.
On the other hand, the group discovered that many X clients will readily crash when fed random streams of data. Others will lock up - and in the process, freeze the X server until the programs are terminated.
Many of the errors that Miller's group discovered result from common programming mistakes with the C programming language: programmers who wrote clumsy or confusing code that did the wrong things; programmers who neglected to check for array boundary conditions; and programmers who assumed that their char variables were unsigned, when in fact they were signed.
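The signed-char assumption deserves a concrete illustration. In this minimal sketch (our own, not from the study), storing the result of getchar() in a char breaks the end-of-file test:

#include <stdio.h>

int main(void)
{
    int ch;   /* must be int, not char: it holds every byte value plus EOF */

    /* The broken version reads:   char c;  while ((c = getchar()) != EOF)
     * If char is signed, an input byte of 0xFF is mistaken for EOF and the
     * loop stops early; if char is unsigned, the test never succeeds and
     * the loop never terminates. */
    while ((ch = getchar()) != EOF)
        putchar(ch);
    return 0;
}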
While these errors can certainly cause programs to crash when they are fed random streams of data, these errors are exactly the kinds of problems that can be exploited by carefully crafted streams of data to achieve malicious results. Think back to the Internet Worm: if attacked by the Miller Fuzz program, the original fingerd program would have crashed. But when presented with the carefully crafted stream that was present in the Morris Worm, the program gave its attacker a root shell!
What is somewhat frightening about the study is that the tests employed by Miller's group are among the least comprehensive known to testers - random, black-box testing. Different patterns of input could possibly cause more programs to fail. Inputs made under different environmental circumstances could also lead to abnormal behavior. Other testing methods could expose these problems where random testing, by its very nature, would not.
Miller's group also found that several commercially available testing tools enabled them to discover additional errors, including buffer overruns and related memory errors. These tools are readily available; vendors, however, are apparently not using them.
Why don't vendors care more about quality? Well, according to many of them, they do care, but quality does not sell. Writing good code and testing it carefully is not a quick or simple task. It requires extra effort, and extra time. The extra time spent on ensuring quality will result in increased cost. To date, few customers (possibly including you, gentle reader) have indicated a willingness to pay extra for better-quality software. Vendors have thus put their efforts into what customers are willing to buy, such as new features. Although we believe that most vendors could do a better job in this respect (and some could do a much better job), we must be fair and point the finger at the user population, too.
In some sense, any program you write might fare as well as vendor-supplied software. However, that isn't good enough if the program is running in a sensitive role and might be abused. Therefore, you must practice good coding habits, and pay special attention to common trouble spots.