CGI Security : Better Safe than Sorry
Creating a Page Counter In Perl
Speed Thrills : CGI Please ... and Fast!
Server-Side Includes and its Extensions
Random and Recursive Crypting using Salt on Unix and Win32
Creating a mailing list using Perl
CGI Programming Made (Relatively) Easy Using Libraries
Published on: Sunday 4th July 1999 By: Pankaj Kamthan
With its first implementation in the NCSA WWW server, CGI has become a powerful and widely used standard for interfacing external applications with WWW servers. A plain HTML document that the WWW server retrieves is static. A CGI script, on the other hand, is executed in real-time, so that it can output dynamic information.
Although any programming language can be used to write CGI scripts, Perl has established itself as the lingua franca for CGI programming. The purpose of this article is to give an insight into the use of CGI libraries, in particular by introducing some of the powerful features of CGI.pm. To avoid any confusion, it should be noted that these libraries will not make it any easier to learn the language in which you write your CGI scripts. The libraries require that you are already familiar with the language (Perl in our case) in order to use them.
We assume that the reader has some background in CGI programming, preferably in Perl 5, and elementary knowledge of HTML. For a tutorial introduction to CGI, see the Introduction to CGI at NCSA. For a primer on Perl objects, see Easy Intro to Using Perl Objects. For definitive treatments on the subject, see the list of references.
This section briefly describes the CGI libraries, in the form of Perl 5 modules, that are currently in use. All of them are freely available.
cgi-lib.pl is perhaps the oldest and most widely used CGI library for script processing. Originally a Perl 4 library, it is now upgraded to Perl 5. The cgi-lib.pl library makes CGI scripting in Perl simple to learn and easy to use, and is a good starting point for migration to more sophisticated libraries. It has the following key features:
CGI Lite is a lightweight module developed as an extension of the Perl 4 version of cgi-lib.pl with some added functionality. Its use and applications are described in detail in the book "CGI Programming on the World Wide Web" by Shishir Gundavaram (O'Reilly & Associates, 1996).
CGI::* modules are Perl 5 modules that allow you to create and decode forms, debug your CGI programs and maintain state between forms. Much of this functionality has now been incorporated in CGI.pm.
CGI.pm is a Perl 5 CGI library that uses objects to create HTML fill-out forms on the fly and to parse their contents. It simplifies tasks such as creating HTTP headers, creating HTML form elements, parsing query strings, maintaining the state of fill-out forms, and more.
First, it is important to note that the use of these libraries is not an "all or nothing" solution to writing CGI scripts. It is context dependent. The CGI modules should generally be used for heavy-duty CGI scripts. For simple scripts, such as for a counter or redirection, it is often easier and quicker to write them from scratch.
Secondly, with several choices available, selecting the appropriate CGI library becomes an issue. In this article we restrict ourselves to CGI.pm because of the advantages it offers.
It may also be helpful to go through the comparative reviews of the libraries, such as cgi-lib.pl vs. CGI.pm.
To use CGI.pm you will need to have access to a WWW server that supports CGI scripts (such as Apache or WebSite) and Perl 5.004 or higher (CGI.pm has not been tested with earlier versions).
The current distribution for Unix, Windows or Macintosh systems can be downloaded from here. It includes instructions for installing as root or as an ordinary user, and for installing on other platforms, including Windows NT, Macintosh and VMS.
We describe here version 2.50, which is the latest released version as of this writing.
CGI.pm can be used in two distinct modes: function-oriented and object-oriented. Here is the same simple script written in each style.

Function-oriented:

    #!/path_to/perl -w
    use CGI qw/:standard/;
    print header(),
          start_html(-title=>'Greetings'),
          h1('Greetings'),
          'Hello World!',
          end_html();

Object-oriented:

    #!/path_to/perl -w
    use CGI;
    $q = new CGI;
    print $q->header(),
          $q->start_html(-title=>'Greetings'),
          $q->h1('Greetings'),
          'Hello World!',
          $q->end_html();
A quick examination here (we shall describe the syntax in detail later) reveals the differences between the two approaches. In the function-oriented mode, the use operator loads the CGI.pm definitions and imports the ":standard" set of function definitions. Then calls are made to various functions such as header(), to generate the HTTP header, start_html(), to produce the top part of an HTML document, h1(), to produce a level one header, and so on. In the object-oriented mode, you write use CGI; without specifying any functions or function sets to import. In this case, you communicate with CGI.pm via a CGI object. The object is created by a call to CGI::new(), and encapsulates all the state information about the current CGI communication, such as the values of the CGI parameters passed to your script.
For small scripts, the function-oriented approach is sufficient. Object-oriented programming style is more verbose and is usually recommended for writing large scripts. Object-oriented style, however, offers various benefits:
For these reasons, we will use only the OOP style to illustrate the examples in this article.
In CGI.pm, everything is done through a "CGI object." When you create one of these objects, it examines the environment for a query string, parses it, and stores the results. You can then ask the CGI object to return or modify the query values.
CGI objects handle the GET and POST methods. (For backward compatibility, they can also distinguish between scripts called from <ISINDEX> documents and form-based documents.)
The most basic use of CGI.pm is to get at the query parameters submitted to your script. To create a new CGI object that contains the parameters passed to your script, include the following at the top of your script:
use CGI; $query = new CGI;
This code calls the new() method of the CGI class and stores a new CGI object in the variable named $query. The new() method does the work of parsing the script parameters and environment variables, and stores its results in the new object. The object can now be used to make method calls that carry out useful tasks such as retrieving the parameters, generating form elements, etc.
An alternative form of the new() method allows you to read script parameters from a previously-opened file handle:
$query = new CGI(FILE_HANDLE)
The file can contain a series of newline-delimited TAG=VALUE pairs. (This format is compatible with the save() method, which lets you save the state of a CGI script to a file and reload it later.)
You can initialize a CGI object from a Perl associative-array reference. Values can be either single- or multiple-valued, as seen in the next example:
$query = new CGI({'name'=>'Einstein', 'interests'=>[qw/WWW CGI Perl/]});
You can initialize a CGI object by passing a URL-style query string to the new() method as follows:
$query = new CGI('foo=bar&size=large');
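The two initializers above can be compared side by side. The following is a minimal sketch, assuming a CGI.pm installation; the parameter names and values are made up for illustration, and the constructor is written as CGI->new, which is equivalent to the new CGI spelling used in this article.

```perl
use strict;
use warnings;
use CGI;

# The same (made-up) data supplied two ways: as a hash reference
# and as a URL-style query string.
my $from_hash = CGI->new({
    name      => 'Einstein',
    interests => [qw/WWW CGI Perl/],
});
my $from_string =
    CGI->new('name=Einstein&interests=WWW&interests=CGI&interests=Perl');

for my $q ($from_hash, $from_string) {
    my $name      = $q->param('name');                 # single value
    my $interests = join ',', $q->param('interests');  # all values
    print "$name: $interests\n";
}
```

Both objects should report the same values. Recent CGI.pm releases print a warning when param() is called in list context and offer multi_param() for that case; the plain param() call is kept here to match the article.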
If the script was invoked as the result of an <ISINDEX> search, the parsed keywords can be fetched by:
@keywords = $query->keywords
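As a small illustration of keyword parsing, the sketch below feeds CGI.pm a query string that contains no "=" or "&", which it treats as an <ISINDEX>-style keyword list. The keywords themselves are invented for the example.

```perl
use strict;
use warnings;
use CGI;

# A query string with no name=value pairs is parsed as a "+"-separated
# list of keywords (the keywords here are hypothetical).
my $query    = CGI->new('relativity+gravity');
my @keywords = $query->keywords;

print join(', ', @keywords), "\n";
```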
To fetch the names of all the parameters passed to your script:
@names = $query->param
If the script was invoked with a parameter list (for example, "name1=value1&name2=value2"), the param() method will return the parameter names as a list, in the same order in which the browser sent them.
To fetch the value or values of a single named parameter:
@values = $query->param('foo');
or
$value = $query->param('foo');
You pass the param() method a single argument to fetch the value of the named parameter. If the parameter is multiple-valued, you can ask to receive an array. Otherwise the method will return a single value.
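The different ways of calling param() can be tried without a web server by handing the constructor a query string. This is a sketch under that assumption; the parameter names and values are illustrative only.

```perl
use strict;
use warnings;
use CGI;

# Stand-in for a real request: one single-valued and one
# multi-valued parameter.
my $query = CGI->new('name=Einstein&interests=WWW&interests=CGI&interests=Perl');

my @names     = $query->param;               # parameter names, in order
my $name      = $query->param('name');       # one value
my @interests = $query->param('interests');  # every value (list context)
my $first     = $query->param('interests');  # first value (scalar context)

print "names:     @names\n";
print "name:      $name\n";
print "interests: @interests\n";
print "first:     $first\n";
```

Note that recent CGI.pm releases warn about the list-context call and recommend multi_param() instead; the article's version predates that method.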
Several other named parameters are also recognized. There are also methods (and named arguments) for setting the value(s) of a named parameter, appending a parameter, deleting a named parameter entirely, deleting all parameters, importing parameters into a namespace, and direct access to the parameter list.
You can create a standard header for a virtual document, as shown in the next example:
print $query->header('image/jpeg');
This prints out the required HTTP Content-type: header and the required blank line underneath it. If no parameter is specified, it will default to 'text/html'. You can specify a status code and a message to pass back to the browser:
print $query->header(-type=>'image/jpeg', -status=>'204 No Content');
This presents the browser with a status code of 204 (No Content).
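Because header() returns a string rather than printing it, the result can be inspected before it is sent. The sketch below captures the default header and one with an explicit type and status; it uses the registered image/jpeg media type and the standard "No Content" reason phrase, and makes no claim about the exact capitalisation or charset suffix, which differ between CGI.pm versions.

```perl
use strict;
use warnings;
use CGI;

my $query = CGI->new({});    # empty parameter set, no server needed

# Default header: Content-Type text/html followed by a blank line.
my $default = $query->header();

# Explicit media type plus a status line.
my $custom = $query->header(
    -type   => 'image/jpeg',
    -status => '204 No Content',
);

print $default;
print $custom;
```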
CGI.pm provides shortcut methods for many other HTML tags, of which we will discuss a few. (To see the entire list of HTML tags that are supported, look at the functions defined in the %EXPORT_TAGS hash in the CGI.pm module file.) Most HTML tags are represented as lowercase function calls. (There are a few exceptions, however, that are due either to conflicts with Perl's functions or to the names of CGI.pm's own methods.)
Including the following
print $query->start_html(-title=>'Relativity', -meta=>{'author'=>'Albert Einstein', 'description'=>'The General Theory of Relativity'}, -BGCOLOR=>'white');
returns the HTML header information and the opening <body> tag. The parameters used here set the document title (-title), add <meta> tags (-meta) and set the background color of the page (-BGCOLOR). Using
print $query->end_html
ends an HTML document by printing the </body></html> tags.
You can create unpaired tags (such as <p>, <hr> or <br>). For example:
print $query->hr;
outputs the text "<hr>". You can also create paired tags (such as <em> or <i>). For example:
print $query->em("New!");
outputs the text "<em>New!</em>".
You can pass as many text arguments as you wish, which allows you to create nested tags. They will be concatenated together with spaces in between. As an example:
print $query->h1("CGI programming",$query->em("can be"),"fun.");
outputs the text:
<h1>CGI programming <em>can be</em> fun.</h1>
To add attributes to an HTML tag, you can pass a reference to an associative array as the first argument; the keys and values of the associative array become the names and values of the attributes. For example, to generate an anchor link, we could do the following:
print a({-href=>"foo.html"},"The Foo Document");
which outputs:
<a href="foo.html">The Foo Document</a>
Sometimes an HTML tag attribute has no value. For example, you may wish to specify that a table has a border with <TABLE BORDER>. To do this, pass undef as the value of the attribute. We will see that in the next example.
Since all HTML tags are "distributive", you can take advantage of this to create HTML tables, as shown in the next example:
    print table({-border=>undef},
        caption(strong('A Singular Matrix with LU Decomposition')),
        Tr({-align=>'LEFT',-valign=>'TOP'},
        [
            th(['','Column 1','Column 2']),
            th('Row 1').td(['0','1']),
            th('Row 2').td(['1','0'])
        ]
        )
    );

which will output HTML along the following lines (attribute case and whitespace vary between CGI.pm versions):

    <table border>
    <caption><strong>A Singular Matrix with LU Decomposition</strong></caption>
    <tr align="LEFT" valign="TOP"><th></th> <th>Column 1</th> <th>Column 2</th></tr>
    <tr align="LEFT" valign="TOP"><th>Row 1</th><td>0</td> <td>1</td></tr>
    <tr align="LEFT" valign="TOP"><th>Row 2</th><td>1</td> <td>0</td></tr>
    </table>
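"Distributive" here means that a tag method given an array reference applies itself, attributes included, to each element of the array in turn. A smaller sketch of the same idea, using a list instead of a table and the object-oriented style (the list items are arbitrary):

```perl
use strict;
use warnings;
use CGI;

my $query = CGI->new({});

# li() receives an attribute hash and an array reference, so it
# produces one <li> element per array entry, each with the attribute.
my $list = $query->ul(
    $query->li({-type => 'disc'}, ['WWW', 'CGI', 'Perl'])
);

print $list, "\n";
```

The result is a single <ul> element containing three <li> elements, one for each entry in the array.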
When used in conjunction with the import facility, the HTML shortcuts can make CGI scripts easier to read. See the section on importing HTML-tag functions.
There are various form-creating methods, all of which return strings containing the HTML code that will create the requested form element. It is up to you to actually print out these strings. You can place formatting tags around the form elements.
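The default values that you specify for the forms are only used the first time the script is invoked. If there are already values present in the query string, they are used, even if blank. If you want to change the value of a field from its previous value, you can either call the param() method to set it, or pass the -override parameter to the form-creating method so that the default is used regardless of the previous value.
If you want to reset all fields to their defaults, you can either create a special defaults button using the defaults() method, or create a hypertext link that calls your script without any parameters.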
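The "sticky" behaviour of default values, and two documented ways around it (the -override parameter and setting the value with param()), can be seen in the following sketch. The query string simulates a resubmitted form, and the field name and values are hypothetical.

```perl
use strict;
use warnings;
use CGI;

# Simulate a second invocation in which the user had typed "old value".
my $query = CGI->new('field_name=old+value');

# Sticky behaviour: the submitted value wins over -default.
my $sticky = $query->textfield(
    -name => 'field_name', -default => 'starting value');

# -override forces the default to be used instead.
my $forced = $query->textfield(
    -name => 'field_name', -default => 'starting value', -override => 1);

# Alternatively, change the stored value before generating the field.
$query->param(-name => 'field_name', -value => 'new value');
my $changed = $query->textfield(
    -name => 'field_name', -default => 'starting value');

print "$sticky\n$forced\n$changed\n";
```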
CGI.pm has support for various form elements. We mention here only a selected few.
    print $query->startform($method,$action,$encoding);
    # . . . form content here . . .
    print $query->endform;
startform() will return a <form> tag with the optional method, action and form encoding that you specify. endform() returns a </form> tag.
A text input field can be created as shown in the next example:
print $query->textfield(-name=>'field_name', -default=>'starting value', -size=>50, -maxlength=>60);
where textfield() will return a text input field. The parameter -name is the required name for the field, the (optional) parameter -default is the starting value for the field contents, the (optional) parameter -size is the size of the field in characters, and the (optional) parameter -maxlength is the maximum number of characters the field will accommodate.
When the form is processed, the value of the text field can be retrieved with:
$value = $query->param('field_name');
As a good practice, every form should have a submit button. This is how one can be created:
print $query->submit(-name=>'button_name', -value=>'value');
where submit() will create the query submission button. The parameter -name is optional. A naming scheme can be useful if you have several submission buttons in your form. The second argument -value is also optional and gives the button a value that will be passed to your script in the query string, and will also appear in the browser as a label.
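Putting the pieces together, a complete page with a one-field form might be assembled as in the sketch below. It uses start_form() and end_form(), the names under which current CGI.pm releases provide the form tags (startform() and endform() are the older spellings used in this article); the action URL, page title and button label are hypothetical.

```perl
use strict;
use warnings;
use CGI;

my $query = CGI->new({});

# Each method returns a string; nothing is sent until we print.
my $form = join "\n",
    $query->start_form(-method => 'POST',
                       -action => '/cgi-bin/example.cgi'),  # hypothetical URL
    'Your name: ',
    $query->textfield(-name => 'field_name', -size => 50, -maxlength => 60),
    $query->submit(-name => 'button_name', -value => 'Send'),
    $query->end_form;

print $query->header,
      $query->start_html(-title => 'Example form'),
      $form,
      $query->end_html;
```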
Saving the state of a form is used in various instances, and is the basis of many shopping carts.
The following:
$query->save(FILE_HANDLE)
writes the current query out to the file handle of your choice (such as a file, a socket or a pipe). The file handle must already be open and be writable. The contents of the form are written out as TAG=VALUE pairs that can be reloaded with the new() method later.
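A round trip through save() and the file-handle form of new() could look like the following sketch. It writes to a temporary file created with the core File::Temp module (an assumption made for the example; any writable file handle would do) and then reads the state back into a second object.

```perl
use strict;
use warnings;
use CGI;
use File::Temp qw(tempfile);

# State to preserve (illustrative values).
my $query = CGI->new('name=Einstein&interests=WWW&interests=CGI');

# Write the parameters out as TAG=VALUE lines.
my ($out, $path) = tempfile(UNLINK => 1);
$query->save($out);
close $out or die "Cannot close $path: $!";

# Later: reopen the file and rebuild a CGI object from it.
open my $in, '<', $path or die "Cannot reopen $path: $!";
my $restored = CGI->new($in);
close $in;

my $name = $restored->param('name');
print "restored name: $name\n";
```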
url() returns the script's URL in a variety of formats. Called without any arguments, it returns the full form of the URL:
http://host_name/path_to/script.cgi
You can modify this format with the named arguments. For example, if the argument -absolute is true, it produces an absolute URL:
$absolute_url = $query->url(-absolute=>1);
yields:
/path_to/script.cgi
Many scripts allocate only a single query object, use it to read parameters or to create a fill-out form, and then discard it. For these types of scripts, it can be useful to import standard CGI module methods into the current name space. This is done by the use CGI statement:
use CGI qw(:standard);
Then, instead of getting parameters like this:
use CGI; $query = new CGI; $g = $query->param('gravitational_constant');
You can do it like this:
use CGI qw(:standard); $g = param('gravitational_constant');
You can also import selected method names:
use CGI qw(header start_html end_html);
The import facility, combined with the HTML shortcuts, makes CGI scripts easier to read. As described above, tag functions automatically generate both the opening and closing tags. For example:
print h1('Level 1 Header');
outputs:
<h1>Level 1 Header</h1>
To generate the start and end tags yourself, you can use the form start_tag_name and end_tag_name, as in:
print start_h1,'Level 1 Header',end_h1;
With a few exceptions, start_tag_name and end_tag_name functions are not generated automatically via use CGI. However, you can request start/end functions for particular tags by putting an "*" in front of their names. For example:
use CGI qw/:standard *table/;
generates, in addition to the standard ones, the functions start_table() and end_table(), which generate a <TABLE> tag and a </TABLE> tag, respectively.
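Separate start and end functions are handy when the body of an element is produced in a loop. The sketch below builds a small table row by row in the function-oriented style; the row data is invented for the example.

```perl
use strict;
use warnings;
use CGI qw/:standard *table/;

# Illustrative data: a label followed by the cell values of each row.
my @rows = (['Row 1', 0, 1], ['Row 2', 1, 0]);

my $html = start_table({-border => 1});
for my $row (@rows) {
    my ($label, @cells) = @$row;
    # td() is given an array reference, so it emits one <td> per cell.
    $html .= Tr(th($label), td(\@cells));
}
$html .= end_table();

print $html, "\n";
```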
There are various other powerful features in CGI.pm which we haven't discussed, in particular, support for Cascading Style Sheets (CSS) Level 1, JavaScript 1.2, HTTP Cookies, and NPH (no parsed header) scripts. CGI.pm can be used with FastCGI, a protocol invented by OpenMarket that speeds up CGI scripts under certain circumstances, and with mod_perl (versions 0.95 and higher), an Apache Web server module that embeds a Perl interpreter into the Web server.
In spite of the numerous advantages and uses of CGI.pm, it has a few limitations.
There are many routine tasks involved when doing CGI programming, which can be greatly simplified by using CGI libraries. The libraries using the object-oriented features of Perl 5 can be particularly useful in writing and maintaining large scripts. Using CGI.pm as a prototype, we attempted to show a glimpse of what they can offer.
TASK | SCRIPT INFORMATION |
---|---|
Save and restore the state of a form to a file | Script in Action, Script Source Code |
Server Push using NPH | Script in Action, Script Source Code |
State persistence using a cookie | Script in Action, Script Source Code |
Side-by-side form and response using HTML frames | Script in Action, Script Source Code |
Verify the contents of a fill-out form with JavaScript | Script in Action, Script Source Code |