
Related items

CGI Security : Better Safe than Sorry

Creating a Page Counter In Perl

Speed Thrills : CGI Please ... and Fast!

CGI Programming Made (Relatively) Easy Using Libraries

Server-Side Includes and its Extensions

Random and Recursive Crypting using Salt on Unix and Win32

Timestamping an HTML Document

Deleting Files in Perl

Creating a mailing list using Perl

Server Side Includes and CGI Security

Reading and Writing to Files on the Server


Published on: Sunday 9th August 1998 By: Jason Nugent

Introduction

In the past few articles, we have taken a look at how data is sent to the server, how it is read into your CGI script, and also how to extract that information from its encoded form. Once we had this data, we showed it back to the user to prove to ourselves that the process worked correctly.

The problem, however, is that once the CGI script finishes executing, our data is lost. Gone. No more. The next time the script runs, it starts off fresh, with no recollection of what just happened.

It's time to change how this works. This article discusses the mechanisms most often used in Perl scripts to store data out to the filesystem so it can be used again at a later time. Once your data is in a file on the server, other programs may access it, and you can manipulate it, graph it, or do just about anything you want with it.

I'm going to assume that you are familiar with the decoding process discussed in the other article. I will also assume that you have the data stored in a series of variables, ready to be saved out to the filesystem. Let us begin.

File Handles

In order to save or read information from a file (or from STDIN, the location where the information comes into your script when POST is used to submit it), you need a filehandle open. A filehandle is basically a pipeline or gateway to a resource outside the script. Typically, these resources are files located on the server. A filehandle is created when you open a file either for reading, writing, or appending. Printing to or reading from the filehandle represents printing to or reading from the file on the server. To create a filehandle, you need to use the open() command.

The structure of the open command is simple, but you can alter the way a file is opened using several modifiers. The basic command is

open (FILEHANDLE, "filename") or die "cannot open file: $!";

The FILEHANDLE is the filehandle that is created during the open process. It is through this filehandle that you will be doing all your communicating with the file you have just opened. The "filename" represents the name of the file you wish to open. The filename can contain complete path information as well:

/home/wwwroot/docs/scripts/myfile.txt

In this example, the entire directory path is entered as part of the filename. The complete path is not necessary for the open() command to work, but I recommend it. It ensures that the file is created in (or read from) the proper directory and allows you to reference files that are not in the directory in which the script is being run. You may also store your filename in a variable, and then pass the variable to the open() command as an argument.

$file = '/home/wwwroot/docs/scripts/myfile.txt';   # store the file
open (FILE, $file) or die "cannot open $file: $!"; # opens the file

Before I discuss the different ways to open the file, I should point out the section of code after the open() command. The statement

or die "cannot open file: $!" 

will terminate your Perl CGI script if the open command is not successful. There is a good reason for this. If the script is not terminated, it will continue to execute with a filehandle that was never opened, so later reads return nothing, later prints fail, and the script may damage resources it has access to. The $! variable contains the operating system's reason for the failure (for example, "No such file or directory" or "Permission denied"). The die command writes its message to standard error, which most web servers record in their error log files, and this will help you figure out what went wrong.
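As a minimal sketch of what $! reports, the following tries to open a file under a made-up path that is assumed not to exist. Instead of dying, it captures the message so it can be inspected; the path and the $error variable are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist, so that open() fails.
my $file = '/no/such/directory/myfile.txt';

my $error = '';
if (open(FILE, $file)) {
    close(FILE);
} else {
    # $! holds the operating system's reason for the failure,
    # for example "No such file or directory".
    $error = "$!";
    print STDERR "cannot open $file: $error\n";
}
```

In a real CGI script you would normally keep the "or die" form shown above; capturing the message like this is mainly useful when you want to show the visitor a friendlier error page.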

As I mentioned, there are several ways to modify the behavior of the opened file. These involve adding something to the front of the filename. The ones I most typically use are, in no particular order

  • > - opens the file in write mode, creating it if necessary (completely overwrites the file if it currently exists). You may not read information from the file in this manner.

  • < - opens the file in read mode. You may NOT write to the file while it is opened in this manner. This is the default method for opening a file. If no modifier is placed in front of the filename, open() defaults to read mode.

  • >> - opens the file in append mode. This will create the file if it does not currently exist, and will add any output printed to the bottom of the file (by default).

  • +>> - opens the file in read/append mode. This mode also creates the file if it does not already exist. Any outputted text is added to the end of the file (by default).

  • +> - this opens the file in read/write mode. This mode is more often used to create a file that will be written to and read from many times during the life of the script - when the file is first opened, any contents of the file are lost. If you wish to open the file, read the contents, and then completely overwrite the file again you need something a bit more special.
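The modifiers above can be tried out with a short sketch like the one below. It writes a file with ">", adds to it with ">>", and reads it back with "<". The scratch directory and the file name modes.txt are assumptions made for the demonstration, not part of the original article.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory and file name are hypothetical, used only for this demo.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/modes.txt";

# ">" creates the file (or empties an existing one) and opens it for writing.
open(OUT, ">$file") or die "cannot open $file for writing: $!";
print OUT "first line\n";
close(OUT) or die "cannot close $file: $!";

# ">>" opens for appending; output lands after the existing contents.
open(OUT, ">>$file") or die "cannot open $file for appending: $!";
print OUT "second line\n";
close(OUT) or die "cannot close $file: $!";

# "<" opens for reading only.
open(IN, "<$file") or die "cannot open $file for reading: $!";
my @lines = <IN>;
close(IN);

print @lines;    # prints both lines, in the order they were written
```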

Locking a File

Because we are working with CGI programs, you must also consider the fact that many copies of your script may be running all at once, each trying to get a hold of your text file for either reading, writing, or updating. Imagine this scenario - a copy of your script opens your text file. During the time that the script is working, a second copy of your script opens the file, makes some changes and then saves the file out to disk again. When the first copy finishes, it will overwrite the changes made by the second copy of your script.

To prevent this from happening, you must lock your file while it is open. A lock makes every other script that also asks for a lock wait before it touches the file (or even reads it, if the file is locked exclusively). Locks made this way are advisory: they only hold back programs that request a lock themselves, so every script that uses the file must do its own locking. To lock a file, use the flock() command. The flock() command takes two parameters - the FILEHANDLE you wish to lock, and the type of locking you wish to do.

The two versions most commonly used in CGI scripts are

flock(FILEHANDLE, 2) or die "cannot lock file exclusively: $!";

and

flock(FILEHANDLE, 8) or die "cannot unlock file: $!";

I should point out that any locked files are unlocked when the Perl script closes the file or finishes executing. However, if you are done using a file and still have quite a bit of processing to do, unlock the file. This way, other instances of your script may have access to it.
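The numbers 2 and 8 are easier to read when written as named constants. As a sketch, the standard Fcntl module exports LOCK_EX and LOCK_UN (typically equal to 2 and 8), along with LOCK_SH for a shared lock. The scratch file counter.txt below is an assumption made for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # exports LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempdir);

# Scratch directory and file name are hypothetical, used only for this demo.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counter.txt";

open(COUNTER, ">>$file") or die "cannot open $file: $!";

# Wait until no other cooperating script holds a lock, then take ours.
flock(COUNTER, LOCK_EX) or die "cannot lock $file exclusively: $!";
print COUNTER "one more visit\n";

# Done with the file: release the lock before any lengthy processing.
flock(COUNTER, LOCK_UN) or die "cannot unlock $file: $!";

# ... long-running work that no longer needs the file would go here ...

close(COUNTER) or die "cannot close $file: $!";
```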

Reading in the File Contents

So, now we have our file opened, and locked so other scripts cannot access it until we are finished. To read a file into memory (i.e., store it in a variable), we shall make use of the line-input operator, <FILEHANDLE>, often called the "diamond" or "angle" operator. When its result is assigned to an array, the operator reads in the complete contents of a FILEHANDLE. In our case, we might want to try something like this.

@my_file = <FILEHANDLE>;

which will read in the contents of the FILEHANDLE we opened with open(), and store each line in a separate cell in our array @my_file. So, the first line of our file could be referenced using $my_file[0], the second as $my_file[1], etc. Each line keeps its trailing newline. When there is no more information to read, the operator returns nothing further (an empty list, or undef when reading one line at a time) and the process stops.
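For large files it can be preferable to read one line at a time rather than loading everything into an array. The following sketch shows that form; the sample file and its three lines are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory and file name are hypothetical, used only for this demo.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";

# Create a small file to read back.
open(OUT, ">$file") or die "cannot open $file for writing: $!";
print OUT "alpha\nbeta\ngamma\n";
close(OUT) or die "cannot close $file: $!";

open(IN, "<$file") or die "cannot open $file for reading: $!";

my $count = 0;
my $last  = '';
while (my $line = <IN>) {    # one line per pass; the loop ends at end of file
    chomp $line;             # remove the trailing newline
    $count++;
    $last = $line;
}
close(IN);

print "read $count lines, last was '$last'\n";
# prints: read 3 lines, last was 'gamma'
```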

So, where are we now? Well, we can open a file, lock it, and read its contents. The following snippet of code illustrates the point:

open (FILE, "/home/wwwroot/docs/myfile.txt") or die "cannot open file: $!";
flock(FILE, 2) or die "cannot lock file: $!";
 
@my_file = <FILE>;  # read in the contents of the file

The preceding section of code will read in a file. If we wanted to write to a file, we could do the following:

open (FILE, ">/home/wwwroot/myfile.txt") or die "cannot open file: $!";
flock(FILE, 2) or die "cannot lock file: $!";

print FILE "this is some text\n";
print FILE "this is some more text\n";

close FILE;  # also unlocks the file

At this point, there should be a file called "myfile.txt" inside the /home/wwwroot/ directory containing the two lines printed above. Notice that the print command names FILE before the text; this sends the output to the file instead of to the browser. Be aware that ">" empties the file as soon as open() runs, which is before the lock is obtained, so this form is only safe when no other script may be reading the file at that moment.

I should point out that your Perl script must be able to write in the directory that you want to create the file in. Since most webservers run as user "nobody", your Perl script must have the correct permissions to write in that particular directory. This usually means creating a directory with universal write access, so be sure to create this directory in a location not normally accessible by regular users (i.e., outside the webserver document path). You might also want to think twice about doing this on a server with many regular shell accounts. Each person who can log in to the server can potentially overwrite this file.

So, with this we can take data that might have been sent to our script via a form on a web page and save it out to a file on the server. If, for example, you have a guestbook form on your page you can open a text file on the server using ">>" and append the new information to the bottom of it. You might have a second script that reads this file and displays the contents of it inside a web page. This is the basic concept behind guestbooks on the web.
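A guestbook along these lines might be sketched as follows. The add_entry helper, the "name|comment" line layout, and the scratch file are all assumptions made for illustration. A real script would take the values from the decoded form data and would need to strip or escape newlines, "|" characters, and HTML in what visitors submit.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Scratch location is hypothetical; a real guestbook would use a fixed path.
my $dir       = tempdir(CLEANUP => 1);
my $guestbook = "$dir/guestbook.txt";

# Hypothetical helper: append one entry while holding an exclusive lock.
sub add_entry {
    my ($name, $comment) = @_;
    open(BOOK, ">>$guestbook") or die "cannot open $guestbook: $!";
    flock(BOOK, LOCK_EX)       or die "cannot lock $guestbook: $!";
    print BOOK "$name|$comment\n";
    close(BOOK)                or die "cannot close $guestbook: $!";
}

add_entry('Alice', 'Nice site');
add_entry('Bob',   'Hello from Halifax');

# What a second, display-only script might do: read every entry back.
open(BOOK, "<$guestbook") or die "cannot open $guestbook: $!";
flock(BOOK, LOCK_SH)      or die "cannot lock $guestbook: $!";
chomp(my @entries = <BOOK>);    # read all lines and drop their newlines
close(BOOK);

foreach my $entry (@entries) {
    my ($name, $comment) = split /\|/, $entry, 2;
    print "$name wrote: $comment\n";
}
```

The shared lock (LOCK_SH) lets several readers work at once while still making them wait for any script that is in the middle of appending.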

Working Example

The following script will take some basic information about a user and save it out to a file on the server. It will also display the PREVIOUS contents of the file to the user once the script is finished running. The next time someone runs the script, he or she will see the information recorded during the run before it. The code is commented, so read it to get an explanation of the internals.

#!/usr/local/bin/perl -w

########################################
##   Environmental Variable Tracker   ##
##      jason@irt.org                 ##
##   Jason Nugent - Copyright 1998    ##
##   Feel free to use this source,    ##
##   as long as this statement is     ##
##   included in the code.            ##
########################################


## the file location is hidden - fewer security risks
## for IRT.org.  To run it on your own system,
## just place the location of the file here

$file = 'file location goes here';


# open the file in read/append mode.  There is a trick used later
# on to make it overwrite the contents of the entire file.  Keep 
# reading to find out what it is.

open (ENVFILE, "+>>" . $file) or die "cannot open $file: $!";      
flock (ENVFILE, 2) or die "cannot flock $file: $!";


# since append normally adds information to the end of the file, we
# must rewind the pointer in the file to the beginning.  The first number
# is the offset in bytes.  The second number says where that offset is
# measured from: 0 means the start of the file.  In this case, 0 and 0
# mean place the pointer at the beginning of our file.

seek ENVFILE, 0, 0;

my @file_contents = <ENVFILE>;  # read the contents into the array

# remove all information in our file.  This means our append will add text
# to an empty file, which is exactly what we want it to do.  The 0 represents
# the size of the file that we want.  We want a completely empty file.

truncate ENVFILE, 0;

my @old_contents = @file_contents; # duplicate the array so we
                                   # aren't overwriting the old stuff

# replace the array with information from the new user

$file_contents[0] = "Browser was " . $ENV{'HTTP_USER_AGENT'} . "<BR>\n";
$file_contents[1] = "They came from " . $ENV{'HTTP_REFERER'} . "<BR>\n";
$file_contents[2] = "The remote address was " . $ENV{'REMOTE_ADDR'} . "<BR>\n";
$file_contents[3] = "The remote host was " . $ENV{'REMOTE_HOST'} . "<BR>\n";

print ENVFILE @file_contents;  # print the new info to the file   

# the first printable line to our browser must be a content-type header line,
# followed by two newline characters

print "Content-type: text/html\n\n";

# the next line tells the interpreter to print everything to the browser
# and also interpolate variables (like @old_contents)
# it will print until it encounters an eof marker on a line by itself

print <<eof;

<HTML>
<HEAD>                          
<TITLE>Previous client information</TITLE>
</HEAD>
 <BODY BGCOLOR="#FFFFFF">

<P>
eof

# we insert some perl code here to make the script print out each line of our 
# information array.  We also tack on a \n which is a perl newline.  Makes our HTML
# easier to read.

foreach $line (@old_contents) {

   print $line . "\n";                  
}

print <<eof;

 </P>
 <P>

 <FORM ACTION="http://tuweb.ucis.dal.ca/cgi-bin/jnugent/client.pl" METHOD="POST">
 <INPUT TYPE="submit" VALUE="Click to View Previous Client">   
 </FORM>

 </P>
 </BODY>
</HTML>

eof
# the eof marker above must sit on a line by itself (no trailing comment);
# it causes the perl script to stop printing to the client

close (ENVFILE);

Now that you've read and understood the code you can try it out for yourself: readwrt.htm


©2018 Martin Webb