CGI Security : Better Safe than Sorry
Creating a Page Counter In Perl
Speed Thrills : CGI Please ... and Fast!
CGI Programming Made (Relatively) Easy Using Libraries
Server-Side Includes and its Extensions
Random and Recursive Crypting using Salt on Unix and Win32
Reading and Writing to Files on the Server
Published on: Sunday 9th August 1998 By: Jason Nugent
In the past few articles, we have taken a look at how data is sent to the server, how it is read into your CGI script, and also how to extract that information from its encoded form. Once we had this data, we showed it back to the user to prove to ourselves that the process worked correctly.
The problem, however, is that once the CGI script is done executing, our data is lost. Gone. No more. The next time our script runs, it starts off fresh, with no recollection of what has just happened.
It's time to change how this works, it would appear. This article will discuss the mechanisms most often used in Perl scripts to store data out to the filesystem so it can be used again at a later time. Once your data is in a file on the server, other programs may access it, you can manipulate it, graph it, or do just about anything you want with it.
I'm going to assume that you are familiar with the decoding process discussed in the previous article. I will assume that you have the data stored in a series of variables, ready to be saved out to the filesystem. Let us begin.
In order to save or read information from a file (or from STDIN, the location where the information comes into your script when POST is used to submit it), you need a filehandle open. A filehandle is basically a pipeline or gateway to a resource outside the script. Typically, these resources are files located on the server. A filehandle is created when you open a file either for reading, writing, or appending. Printing to or reading from the filehandle represents printing to or reading from the file on the server. To create a filehandle, you need to use the open() command.
The structure of the open command is simple, but you can alter the way a file is opened using several modifiers. The basic command is
open (FILEHANDLE, "filename") or die "cannot open file: $!";
The FILEHANDLE is the filehandle that is created during the open process. It is through this filehandle that you will be doing all your communicating with the file you have just opened. The "filename" represents the name of the file you wish to open. The filename can contain complete path information as well:
/home/wwwroot/docs/scripts/myfile.txt
In this example, the entire directory path is entered as part of the filename. The complete path is not necessary for the open() command to work, but I recommend it. It ensures that the file is created in (or read from) the proper directory and allows you to reference files that might not exist in the directory in which the script is being run. You may also store your filename in a variable, and then pass the variable to the open() command as an argument.
$file = '/home/wwwroot/docs/scripts/myfile.txt';      # store the file name
open (FILE, $file) or die "cannot open $file: $!";    # opens the file
Before I discuss the different ways to open the file, I should point out the section of code after the open() command. The statement
or die "cannot open file: $!"
will terminate your Perl CGI script if the open command is not successful. There is a good reason for this. If the script is not terminated, it will continue to execute and may cause damage to resources the script has access to. The $! variable contains the error message reported by the operating system for the operation that failed (for example, "No such file or directory" or "Permission denied"). Because die prints its message to STDERR, the message will normally end up in your web server's error log files, which will help you figure out what went wrong.
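To see what $! actually holds, here is a minimal sketch that tries to open a file that does not exist. The temporary directory and the file name are made up for the illustration, and the eval block is only there so the example can print the message instead of exiting, which is not something the article's scripts do.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical location: a fresh temporary directory that is
# guaranteed not to contain the file we ask for.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/no-such-file.txt";

# eval traps the die, so we can look at the message instead of exiting.
my $ok = eval {
    open(FILE, $file) or die "cannot open $file: $!";
    close FILE;
    1;
};
my $message = $ok ? '' : $@;

# In a real CGI script the message would go to STDERR (the error log).
print "open failed with: $message" unless $ok;
```

The part of the message after the colon comes from the operating system, so its exact wording depends on the platform and on why the open failed.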
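As I mentioned, there are several ways to modify the behavior of the opened file. These involve adding something to the front of the filename. The ones used in this article are: no prefix at all (or "<") to open a file for reading, ">" to open it for writing (creating the file, or emptying it if it already exists), ">>" to append to the end of it, and "+>>" to both read from it and append to it.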
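To make the prefixes concrete, here is a small sketch that exercises three of the standard ones (">", ">>", and a plain read) against a scratch file. The temporary directory and the file name modes.txt are assumptions made for the example, not paths from the article.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed on exit
my $file = "$dir/modes.txt";        # hypothetical file name

# ">" creates the file, or empties it if it already exists
open(FILE, ">$file") or die "cannot open $file for writing: $!";
print FILE "first line\n";
close FILE;

# ">>" adds to the end of the file and keeps what is already there
open(FILE, ">>$file") or die "cannot open $file for appending: $!";
print FILE "second line\n";
close FILE;

# no prefix (or "<") opens the file for reading only
open(FILE, $file) or die "cannot open $file for reading: $!";
my @lines = <FILE>;
close FILE;

print @lines;   # prints "first line" and "second line", one per line
```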
Because we are working with CGI programs, you must also consider the fact that many copies of your script may be running all at once, each trying to get a hold of your text file for either reading, writing, or updating. Imagine this scenario - a copy of your script opens your text file. During the time that the script is working, a second copy of your script opens the file, makes some changes and then saves the file out to disk again. When the first copy finishes, it will overwrite the changes made by the second copy of your script.
To prevent this from happening, you must lock your file while it is open. Locking a file ensures that no other script that also asks for a lock may tamper with its contents (or even read the file, if it is locked exclusively) while it is locked. Keep in mind that these locks are advisory: they only protect you from programs that call flock() themselves. To lock a file, use the flock() command. The flock() command takes two parameters - the FILEHANDLE you wish to lock, and the type of locking you wish to do.
The two versions most commonly used in CGI scripts are
flock(FILEHANDLE, 2) or die "cannot lock file exclusively: $!";
and
flock(FILEHANDLE, 8) or die "cannot unlock file: $!";
I should point out that any locked files are unlocked when the perl script closes the file or finishes executing. However, if you are done using a file and still have quite a bit of processing to do, unlock the file. This way, other instances of your script may have access to it.
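The numbers 2 and 8 are easier to read when they have names. As a sketch of the same lock and unlock calls, the standard Fcntl module exports LOCK_EX and LOCK_UN (along with LOCK_SH for shared locks), which correspond to 2 and 8 on most systems. The scratch file created with File::Temp is an assumption for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);            # LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempfile);

# Hypothetical scratch file so the example can run anywhere.
my ($tmp, $file) = tempfile(UNLINK => 1);
close $tmp;

open(FILE, ">>$file") or die "cannot open $file: $!";

# same as flock(FILE, 2): wait until we hold an exclusive lock
flock(FILE, LOCK_EX) or die "cannot lock file exclusively: $!";
print FILE "written while holding the lock\n";

# same as flock(FILE, 8): give the lock up early
flock(FILE, LOCK_UN) or die "cannot unlock file: $!";

# ... any long-running work that no longer needs the file goes here ...

close FILE;
print "done\n";
```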
So, now we have our file opened, and locked so other scripts cannot access it until we are finished. To read a file into memory (i.e., store it in a variable), we shall make use of the "diamond" operator, <>, wrapped around a filehandle. When its result is assigned to an array, the operator will read in the complete contents of a FILEHANDLE. In our case, we might want to try something like this.
@my_file = <FILEHANDLE>;
which will read in the contents of the FILEHANDLE we opened with open(), and store each line in a separate cell in our array @my_file. So, the first line of our file could be referenced using $my_file[0], the second as $my_file[1], etc. Each stored line still ends with its newline character. When there is no more information to read, <> has nothing further to return (it gives back the undefined value if you are reading one line at a time) and the process will stop.
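The two ways of reading can be compared side by side. This is a minimal sketch using a scratch file with three made-up lines: the array assignment reads everything at once, while the while loop reads one line per pass and ends when <FILE> returns the undefined value.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file with three lines of sample data.
my ($tmp, $file) = tempfile(UNLINK => 1);
print $tmp "alpha\nbeta\ngamma\n";
close $tmp;

# List context: every line at once, one array cell per line.
open(FILE, $file) or die "cannot open $file: $!";
my @my_file = <FILE>;
close FILE;
print "first line is: $my_file[0]";

# Scalar context: one line per iteration, until the end of the file.
my $count = 0;
open(FILE, $file) or die "cannot open $file: $!";
while (my $line = <FILE>) {
    chomp $line;              # remove the trailing newline
    $count++;
    print "$count: $line\n";
}
close FILE;
```

Reading line by line is the gentler choice for large files, since only one line is held in memory at a time.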
So, where are we now? Well, we can open a file, lock it, and read its contents. The following snippet of code illustrates the point:
open (FILE, "/home/wwwroot/docs/myfile.txt") or die "cannot open file: $!";
flock(FILE, 2) or die "cannot lock file: $!";
@my_file = <FILE>;    # read in the contents of the file
close FILE;           # also unlocks the file
The preceding section of code will read in a file. If we wanted to write to a file, we could do the following:
# note: ">" empties the file as soon as it is opened, before the lock is granted
open (FILE, ">/home/wwwroot/myfile.txt") or die "cannot open file: $!";
flock(FILE, 2) or die "cannot lock file: $!";
print FILE "this is some text\n";
print FILE "this is some more text\n";
close FILE;    # also unlocks the file
At this point, there should be a file called "myfile.txt" inside the /home/wwwroot/ directory with the two lines printed to it above. Notice that the print command redirects the output to FILE. By placing the filehandle in the print statement in this manner, you may shuffle information throughout your filesystem.
I should point out that your perl script must be able to write in the directory that you want to create the file in. Since most webservers run as user "nobody", your perl script must have the correct permissions to write in that particular directory. This usually means creating a directory with universal write access, so be sure to create this directory in a location not normally accessible by regular users (i.e., outside the webserver document path). You might also want to think twice about doing this on a server with many regular shell accounts. Each person who can log in to the server can potentially overwrite this file.
So, with this we can take data that might have been sent to our script via a form on a web page and save it out to a file on the server. If, for example, you have a guestbook form on your page you can open a text file on the server using ">>" and append the new information to the bottom of it. You might have a second script that reads this file and displays the contents of it inside a web page. This is the basic concept behind guestbooks on the web.
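As a rough sketch of that guestbook idea, the two halves might look like the following. The helper names add_entry() and read_entries(), the "name|comment" line format, and the scratch location are all assumptions made for the illustration; in a real script the name and comment would come from the decoded form data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

my $dir       = tempdir(CLEANUP => 1);     # hypothetical scratch location
my $guestbook = "$dir/guestbook.txt";

# Hypothetical helper: append one entry to the end of the guestbook.
sub add_entry {
    my ($file, $name, $comment) = @_;
    # keep each entry on a single line
    for ($name, $comment) { s/[\r\n]+/ /g }
    open(BOOK, ">>$file") or die "cannot open $file: $!";
    flock(BOOK, LOCK_EX) or die "cannot lock $file: $!";
    print BOOK "$name|$comment\n";
    close BOOK;                              # also unlocks the file
}

# Hypothetical helper: what a second, display-only script would do.
sub read_entries {
    my ($file) = @_;
    open(BOOK, $file) or die "cannot open $file: $!";
    flock(BOOK, LOCK_SH) or die "cannot lock $file: $!";
    my @entries = <BOOK>;
    close BOOK;
    return @entries;
}

add_entry($guestbook, "Alice", "Nice site!");
add_entry($guestbook, "Bob", "Hello from\nHalifax");
print read_entries($guestbook);
```

Because the file is opened with ">>", earlier entries are never emptied out, and the exclusive lock keeps two simultaneous submissions from interleaving their lines.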
The following script will take some basic information about a user and save it out to a file on the server. It will also display the PREVIOUS contents of the file to the user once the script is finished running. The next time a user runs the script, he or she will see the information submitted the first time they ran it. The code is commented, so read it to get an explanation of the internals.
#!/usr/local/bin/perl -w

########################################
##   Environmental Variable Tracker   ##
##   jason@irt.org                    ##
##   Jason Nugent - Copyright 1998    ##
##   Feel free to use this source,    ##
##   as long as this statement is     ##
##   included in the code.            ##
########################################

## the file location is hidden - fewer security risks
## for IRT.org. To run it on your own system,
## just place the location of the file here

$file = 'file location goes here';

# open the file in read/append mode. There is a trick used later
# on to make it overwrite the contents of the entire file. Keep
# reading to find out what it is.

open (ENVFILE, "+>>" . $file) or die "cannot open $file: $!";
flock (ENVFILE, 2) or die "cannot flock $file: $!";

# since append normally adds information to the end of the file, we
# must rewind the pointer in the file to the beginning. The first number
# is the byte offset to move to. The second number says where that offset
# is measured from: 0 means the start of the file. In this case, 0 and 0
# mean place the pointer at the beginning of our file.

seek ENVFILE, 0, 0;

my @file_contents = <ENVFILE>;   # read the contents into the array

# remove all information in our file. This means our append will add text
# to an empty file, which is exactly what we want it to do. The 0 represents
# the size of the file that we want. We want a completely empty file.

truncate ENVFILE, 0;

my @old_contents = @file_contents;   # duplicate the array so we
                                     # aren't overwriting the old stuff

# replace the array with information from the new user

$file_contents[0] = "Browser was " . $ENV{'HTTP_USER_AGENT'} . "<BR>\n";
$file_contents[1] = "They came from " . $ENV{'HTTP_REFERER'} . "<BR>\n";
$file_contents[2] = "The remote address was " . $ENV{'REMOTE_ADDR'} . "<BR>\n";
$file_contents[3] = "The remote host was " . $ENV{'REMOTE_HOST'} . "<BR>\n";

print ENVFILE @file_contents;   # print the new info to the file

# the first printable line to our browser must be a content-type header line,
# followed by two newline characters

print "Content-type: text/html\n\n";

# the next line tells the interpreter to print everything to the browser
# and also interpolate variables (like @old_contents)
# it will print until it encounters an eof marker on a line by itself

print <<eof;
<HTML>
<HEAD>
<TITLE>Previous client information</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<P>
eof

# we insert some perl code here to make the script print out each line of our
# information array. We also tack on a \n which is a perl newline. Makes our HTML
# easier to read.

foreach $line (@old_contents) {
    print $line . "\n";
}

print <<eof;
</P>
<P>
<FORM ACTION="http://tuweb.ucis.dal.ca/cgi-bin/jnugent/client.pl" METHOD="POST">
<INPUT TYPE="submit" VALUE="Click to View Previous Client">
</FORM>
</P>
</BODY>
</HTML>
eof

# the eof marker above causes the perl script to stop printing to the client

close (ENVFILE);   # closing the file also unlocks it
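One caution about the script above: values such as HTTP_USER_AGENT and HTTP_REFERER are supplied by the visitor's browser, and the script writes them into the page exactly as received, so the next visitor's browser would interpret any HTML they contain. A common precaution is to escape the special characters first. The following is only a sketch; escape_html() is a hypothetical helper, not part of the original script, and the sample value is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: make a value safe to place inside HTML text.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;   # some variables may not be set at all
    $text =~ s/&/&amp;/g;              # must come first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Made-up value standing in for $ENV{'HTTP_USER_AGENT'}.
my $agent = '<script>alert("hi")</script>';
print "Browser was " . escape_html($agent) . "<BR>\n";
```

Treating an undefined value as an empty string also avoids the warnings that the -w switch produces when a variable such as REMOTE_HOST is not set by the server.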
Now that you've read and understood the code, you can try it out for yourself: readwrt.htm