Secure programmer: Keep an eye on inputs

By David A. Wheeler - 2004-01-26

Common input sources cont.

Files

As mentioned in my previous installment, don't trust filenames that can be set by an attacker. Linux and Unix allow just about any series of characters to be a filename, so if you're traversing a directory or accepting a filename from an attacker, be prepared. Attackers can create filenames with leading "-", filenames with special characters such as "&", and so on.
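
As a minimal sketch of being "prepared," the Python below shows two defensive habits: accepting only filenames that match a conservative allowlist, and prefixing "./" to a relative name that starts with a dash before handing it to another command. The function names and the exact allowlist pattern are assumptions made for illustration; a real program would pick a policy that fits its own needs.

```python
import re

# Hypothetical policy: letters, digits, '.', '_' and '-' only, and the
# first character may not be '-' or '.'.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9._-]*")


def is_safe_filename(name: str) -> bool:
    """Return True only if the whole name matches the allowlist above."""
    return _SAFE_NAME.fullmatch(name) is not None


def neutralize_leading_dash(name: str) -> str:
    """Prefix './' so a relative name starting with '-' is not read as an option."""
    return "./" + name if name.startswith("-") else name
```

Rejecting anything outside a known-good pattern is usually easier to get right than trying to list every dangerous character.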

Don't trust file contents that can be controlled by untrusted users. That includes files viewed or edited by a program if they might be mailed by an attacker. For example, the popular text editor vim version 5.7, when asked to edit a file, would look for an embedded statusline command to set information on its status line, and that command could in turn execute an arbitrary shell program. An attacker could e-mail a specially rigged file to the victim, and if the victim used vim to read or edit it, the victim would run whatever program the attacker wanted. Oops.

Avoid getting configuration information from the current directory, because a user might view a directory controlled by an attacker who has created a malicious configuration file (for example, the attacker may have sent a compressed directory with the data and a malicious configuration file). Instead, get configuration information from /etc, the user's home directory, and/or the desktop environment's configuration library. It's a common convention to store configuration and other per-user information in "~/.program-name"; the leading period means it won't clutter normal directory listings. If you really must get configuration information from the current directory, aggressively check all data from it.
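
The lookup order described above might be sketched like this in Python. The function name and the program name "myprog" are hypothetical, and the desktop environment's configuration library is left out; the point is only that the current directory never appears in the list.

```python
from pathlib import Path


def config_candidates(program_name: str) -> list[Path]:
    """Return configuration locations to try, in order.

    The current directory is deliberately never included.
    """
    return [
        Path.home() / f".{program_name}",  # per-user: ~/.program-name
        Path("/etc") / program_name,       # system-wide
    ]
```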

Don't let attackers control any temporary files. I suggest placing temporary directories inside a user's home directory if the user is trusted. If that isn't acceptable, use secure methods to create and use temporary files (I'll discuss how to securely create temporary files in a later article).

File descriptors

Sneaky attackers may start a program but do strange things to its standard input, standard output, or standard error. For example, an attacker might close one or more of them so that the next file you open is also where normal output goes. This is especially a problem for setuid/setgid programs. Some of today's Unix-like systems counter this, but not all.

One way a setuid/setgid program can counter this attack is to repeatedly open /dev/null using open() until the returned file descriptor's value is more than 2 (you must do this before opening any other files, preferably early in program initialization). Then, if the first call to open() returned 2 or less, exit without printing any messages. By first repeatedly opening /dev/null, you protect yourself from yourself -- bad things won't happen if you accidentally try to open files and then print an error message. There's no need to print an error message in this case, since file descriptors 0 through 2 are only closed if an attacker is trying to subvert your program.
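
Here is a rough Python rendering of that logic, using os.open(), which is a thin wrapper over the open() system call. The scenario in the text is really a setuid/setgid program (typically written in C), so treat this as a sketch of the technique rather than a drop-in defense. The function name is an assumption, and it returns a flag instead of exiting so the caller can decide how to quit.

```python
import os


def standard_fds_were_open() -> bool:
    """Fill any closed slot among fds 0-2 with /dev/null.

    Returns False if the very first open() landed on fd 0, 1, or 2,
    which means a standard descriptor was closed at startup.
    """
    first_fd = None
    while True:
        fd = os.open("/dev/null", os.O_RDWR)
        if first_fd is None:
            first_fd = fd
        if fd > 2:
            os.close(fd)  # spare descriptor above the standard three
            break
        # fd is 0, 1, or 2: leave it open so later opens cannot land there
    return first_fd > 2
```

A caller would run this before opening anything else and, if it returns False, quit silently (for example with os._exit(1)).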

Command line

Programs can be started up with data from the command line -- but can you trust that data? Setuid/setgid programs in particular cannot. If you can't trust the data, be prepared for anything -- large arguments, a huge number of arguments, improbable characters, and so on. Note that the name of the program is just argument number 0 in the command line values -- don't trust the program name, since an attacker can change it.

Also, try to design your command-line syntax so that it's easier to use securely. For example, support the standard "--" (double-dash) option that means "no more options," so that scripts can use the option to foil attackers who create filenames (like "-fr") that begin with dash. Otherwise, an attacker can create "-fr" as a file and try to talk users into running "yourcommand *"; your program may then misinterpret the filename ("-fr") as an option.
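
Many option-parsing libraries already honor "--". The sketch below uses Python's standard argparse module with two hypothetical flags, -f and -r, to show how the same string "-fr" is treated with and without the double dash.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="yourcommand")
    # -f and -r are hypothetical flags, used only to mirror the "-fr" example.
    parser.add_argument("-f", action="store_true")
    parser.add_argument("-r", action="store_true")
    parser.add_argument("files", nargs="*")
    return parser


parser = build_parser()

# Without "--", a file named "-fr" is parsed as the two flags -f and -r.
unsafe = parser.parse_args(["-fr"])

# With "--", everything that follows is a filename, even if it starts with a dash.
safe = parser.parse_args(["--", "-fr"])
```

A script that calls the program with untrusted names would then write "yourcommand -- *" rather than "yourcommand *".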

Graphical user interface (GUI)

Here's a recipe for disaster: a process has special privileges (for instance, if it's setuid/setgid), it uses the operating system's graphical user interface (GUI) libraries, and the GUI user isn't totally trusted. The problem is that GUI libraries (including those on Unix, Linux, and Windows) simply aren't designed to be used that way. Nor would it make sense to try to do so -- GUI libraries are huge and depend on large substructures, so it would be difficult to fully analyze all that code for its security properties. The GTK+ GUI library even halts if it detects that it's running in a setuid program, because it's not supposed to be used that way (kudos to the GTK+ developers for proactively preventing this security problem).

Does that mean that you're doomed to the command line? No. Break your program into smaller parts, have an unprivileged part implement the GUI, and have a separate part implement the privileged operations. Here are some common ways to do this:

It's often easiest to implement the privileged operations as a command-line program that's called by the GUI -- that way you get both a GUI and command-line interface (CLI) "for free," simplifying scripting and debugging. Typically the CLI privileged program is a setuid/setgid program. The privileged program, of course, must defend itself from all attacks, but this approach usually means that the part of the program that must be secured is much smaller and easier to defend.
If you need high-speed communication, start up the program as a privileged program, split it into separate processes that can securely communicate, and then have one process permanently drop its privileges and run the GUI.
Another approach is to implement a privileged server that responds to requests, and then create the GUI as a client.
Use a Web interface; create a privileged server and use a Web browser as the client. This is really a special case of the previous method, but it's so flexible that it's often worth considering. You'll need to secure it just like any other Web application, which brings us to the problem of network data.
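
The first approach above, an unprivileged GUI that calls a small privileged command-line helper, might look roughly like this on the GUI side. The helper path and the action names are invented for illustration. The helper must still validate everything it receives, since an attacker can run it directly without the GUI.

```python
import subprocess

# Hypothetical helper path and action names, for illustration only.
HELPER = "/usr/local/sbin/example-helper"
ALLOWED_ACTIONS = {"status", "restart"}


def helper_argv(action: str, target: str) -> list[str]:
    """Build the argument vector passed to the privileged helper."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # "--" marks the end of options, so target is never parsed as one.
    return [HELPER, action, "--", target]


def run_helper(action: str, target: str) -> subprocess.CompletedProcess:
    """Run the helper without a shell, so metacharacters in target stay literal."""
    return subprocess.run(
        helper_argv(action, target),
        capture_output=True,
        text=True,
        check=False,
    )
```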

Network data

If data comes from a network, you should usually treat it as highly untrusted. Don't trust the "source IP" address, the HTTP "Referer" header value (the header name really is spelled that way), or similar data to tell you where the data really came from; those values come from the sender and can be forged. Be careful of values from the domain name system (DNS); DNS implements a distributed database, and some of those values may be supplied by the attacker.

If you have a client/server system, the server should never trust the client. The client data could be manipulated before reaching the server, the client program might have been modified, or attackers could have created their own client (many have!). If you're getting data from a Web browser, remember that Web cookies, HTML form data, URLs, and so on can be set by the user to arbitrary values. This is a common problem in Web shopping-cart applications; many of these applications use hidden HTML form fields to store product information (such as price) and related information (such as shipping costs), and blindly accept these values when users send them. Not only can users set product prices to low values or zero, in some cases they can even set negative prices to receive the merchandise and an additional cash bonus. Remember that you have to check all data; some Web shopping carts check the product data but forget to check the shipping price.
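
One way to avoid the shopping-cart mistake is to treat the form as nothing more than a product identifier and a quantity, and to look up every monetary value on the server. In this sketch the catalog, the field names, the quantity limit, and the flat per-order shipping charge are all assumptions made for illustration.

```python
from decimal import Decimal

# Hypothetical server-side catalog; a real one would live in a database.
CATALOG = {
    "widget": {"price": Decimal("19.99"), "shipping": Decimal("4.50")},
    "gadget": {"price": Decimal("5.00"), "shipping": Decimal("2.00")},
}
MAX_QUANTITY = 100


def order_total(form: dict[str, str]) -> Decimal:
    """Compute the total from server-side data, ignoring any client-sent prices."""
    item = CATALOG.get(form.get("product_id", ""))
    if item is None:
        raise ValueError("unknown product")
    quantity_text = form.get("quantity", "")
    # Only plain ASCII digits are accepted, which also rules out "-1".
    if not (quantity_text.isascii() and quantity_text.isdigit()):
        raise ValueError("quantity must be a whole number")
    quantity = int(quantity_text)
    if not 1 <= quantity <= MAX_QUANTITY:
        raise ValueError("quantity out of range")
    return item["price"] * quantity + item["shipping"]
```

Any "price" or "shipping" field the browser sends is simply never read, so setting it to zero or a negative number has no effect.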

If you're writing a Web application, limit GET requests to queries for data. Don't let GET requests change data (such as transferring money) or trigger any other action. Users can be easily fooled into clicking on malicious hyperlinks in their Web browsers, which then send GET requests. Instead, if you get a GET request for some action other than a query, send back a confirmation message of the form "you asked me to do X, is that okay? (Ok, Cancel)". Note that limiting GET requests won't help you with the problem of incorrect client data (as discussed in the previous paragraph) -- servers still need to check data from their clients!
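
Stripped of any particular Web framework, the rule might be sketched as the dispatch below. The function name, the money-transfer action, and the response strings are hypothetical, and a real handler would still validate the submitted data before acting on it.

```python
def handle_transfer_request(method: str) -> tuple[int, str]:
    """Return (HTTP status, body) for a hypothetical money-transfer endpoint."""
    if method == "POST":
        # The state change happens only here, after the POSTed data is checked.
        return 200, "transfer performed"
    if method == "GET":
        # Never act on a GET; answer with a confirmation page instead.
        return 200, "You asked me to transfer money, is that okay? (Ok, Cancel)"
    return 405, "method not allowed"
```

The confirmation page's "Ok" button would submit a form with POST, so following a hyperlink can never complete the action on its own.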

Miscellaneous

Programs have many other inputs, such as the current directory, signals, memory maps, System V IPC, the umask, and the state of the filesystem. Armed with the information you've gained here, the important thing is to not overlook these things as inputs, even though they don't always smell like inputs.

First published by IBM developerWorks