CGI Security : Better Safe than Sorry
Creating a Page Counter In Perl
Speed Thrills : CGI Please ... and Fast!
CGI Programming Made (Relatively) Easy Using Libraries
Server-Side Includes and its Extensions
Random and Recursive Crypting using Salt on Unix and Win32
Creating a mailing list using Perl
Published on: Sunday 25th October 1998 By: Jason Nugent
In the last article, we examined how it was possible to read and write to a file stored on the server. I mentioned that this was the basis behind saving state on a server and that numerous common activities on the Internet involve using files on the server in order to work. These included guestbooks, feedback forms, and many others.
This article will take a look at one of the more common activities on the web - a mailing list. Many websites (including irt.org) use mailing lists to let visitors know when the site is updated, removing the unnecessary burden of having to return day after day only to be greeted with old content. These mailing lists typically have some sort of front end that lets a user enter his or her email address. A CGI script then accepts this address and adds it to either a text file or database on the server.
Usually, when a second script is run, the text file is opened, each email address is extracted, and then a predetermined email message (usually stored in a second text file) is sent to each one in turn.
Shall we begin?
Generally, this can be quite simple. All you really need is a small form that has a single text field which accepts an email address. You may keep the form simple, or you can go off and read my other article on form field validation using regular expressions and JavaScript 1.2 to find out how to ensure that only valid (syntax-wise) addresses are sent to the server. Something like this should do quite nicely. You don't even need a submit button since a single field form will submit when you press the enter key.
<FORM ACTION="/cgi-bin/email.pl" METHOD="POST">
Email Address: <INPUT TYPE="text" NAME="address">
</FORM>
Now, what you need is a CGI script on the server to accept the email address and store it in the text file. To make things a bit easier (and to keep this article to a reasonable length), I am going to assume that you have read my other article that details how to extract information from the server. Remember, since we are using a POST method here, it will enter into the CGI script on STDIN.
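As a reminder of how the address gets from the form into $email, here is a minimal sketch of decoding a POST body by hand. The helper name parse_form is made up for this example and is not from the earlier article; the field name address comes from the form above.

```perl
use strict;
use warnings;

# Decode an application/x-www-form-urlencoded string into a hash.
# parse_form() is a hypothetical helper name used only in this sketch.
sub parse_form {
    my ($body) = @_;
    my %field;
    foreach my $pair (split /&/, $body) {
        next unless length $pair;                     # ignore empty pairs
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                                  # "+" encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;      # %XX encodes one byte
        }
        $field{$name} = $value;
    }
    return %field;
}

# In the CGI script, the POST body arrives on STDIN and its length is in
# the CONTENT_LENGTH environment variable:
#
#   read(STDIN, my $body, $ENV{CONTENT_LENGTH} || 0);
#   my %form  = parse_form($body);
#   my $email = $form{address};
```

The CGI module's param() function does the same job and copes with more corner cases, so it is usually the better choice in a real script.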
So, assume that we have our email address stored in the variable $email. The following snippet of code will open a text file located on the server:
my $file = '/home/http/htdocs/email_file'; # assigns the file location
                                           # to the $file variable.

# opens the file in append mode
open (FILE, ">>" . $file) or die "cannot open file for appending: $!";

# we want to make sure nothing overwrites our changes
flock (FILE, 2) or die "cannot lock file exclusively: $!";
Remember that we can use the $! variable as a window into exactly what went wrong with our script. It will contain a more detailed error message than what Perl may typically give you by default when a script fails.
Now that we have our file open, we can add our new address to the end of it. Since, by default, append mode adds to the end of a file, this is rather easy. A simple print statement will take care of it for us.
print FILE $email . "\n"; # we concatenate a newline to the end of
                          # our email address, so the next time we
                          # start off on a fresh line in the file.
Notice that we use the same FILE filehandle that we used when we opened the file originally.
When we close our file, it is saved automatically.
close (FILE) or die "cannot close file: $!"; # we close the filehandle of the file
So, where are we now? Well, at this point we have taken the email address from the user, opened a text file on the server, and then added the new email address to the bottom of the file. This is the process that will occur each time someone fills in the form on your site and submits it. Your CGI script will run each time and soon you will have a text file containing hundreds (or maybe even thousands!) of email addresses.
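Put together, the subscribe steps might look something like the sketch below. The function name add_address is invented for this example, and it uses a lexical filehandle and the three-argument form of open rather than the bareword FILE used above. LOCK_EX is the symbolic name from the Fcntl module for the exclusive lock that the article writes as 2.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);    # provides LOCK_EX

# add_address() is a hypothetical name used only for this sketch.
sub add_address {
    my ($file, $email) = @_;

    # open in append mode, so the new line goes to the end of the file
    open(my $fh, '>>', $file) or die "cannot open file for appending: $!";

    # hold an exclusive lock while we write
    flock($fh, LOCK_EX) or die "cannot lock file exclusively: $!";

    print $fh $email . "\n";

    # closing the file saves it and releases the lock
    close($fh) or die "cannot close file: $!";
}
```

A CGI script would call it as add_address('/home/http/htdocs/email_file', $email) once the address has been read from the form.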
Subscribing someone to a mailing list is all fine and dandy, but there will come a time when people who have originally subscribed to your mailing list no longer want to receive mail from you. In a lot of cases, you may only have a few subscribers so this isn't really much of a problem - just edit the file by hand and remove the email address of the person who wants to be taken off. Of course, doing it by hand means two things. First, it means that the user has to actually send you an email asking to be taken off. Second, it means that it is one more responsibility that you have to take care of.
Fortunately, it is relatively easy to implement a feature where users can unsubscribe themselves. Just place a form on your site that takes their email address and removes it from the file on the server by calling another CGI script. What would our form look like? Pretty much exactly like the one for subscribing a user:
<FORM ACTION="/cgi-bin/removeme.pl" METHOD="POST">
Email Address: <INPUT TYPE="text" NAME="address">
</FORM>
Notice that it calls a different CGI script. It is possible to put both the subscription and unsubscription features into the same script, but I find that making a program do more than one thing tends to make code confusing. It's often better to break things up into several smaller programs. This goes double for object oriented languages like Java.
So, what does our backend have to do in this case? Well, it has to open the email address file, find the email address, and remove it. So, let's begin with the open file part once more.
my $file = '/home/http/htdocs/email_file'; # assigns the file location
                                           # to the $file variable.

# opens the file for reading and writing - we have to use a trick here again, and
# I mentioned it in a previous article. See if you can remember it before I use it :)
# (plain append mode would not let us read the file back in)
open (FILE, "+<" . $file) or die "cannot open file for updating: $!";

# we want to make sure nothing overwrites our changes
flock (FILE, 2) or die "cannot lock file exclusively: $!";
We now have to read our email file into memory. We do this (as before) by using the <> operator.
my @emailfile = <FILE>; # read it into @emailfile
Here comes the trick - we have read our email file into memory, but we have to parse it and print everything back to the file EXCEPT the email address we are removing. For this, we have to seek to the beginning of the file and then truncate the file to zero size.
seek FILE, 0, 0;    # move to the beginning of the file and then
truncate FILE, 0;   # remove all content from it
We now have to write the contents of our array back to the file, except the one line containing the email address we are removing. To do this, we need to use a regular expression. I love regular expressions. We can cycle through the file using a foreach loop:
foreach my $line (@emailfile) {
    if ($line !~ /^\Q$email\E$/o) {
        print FILE $line; # we don't need a newline, since the file had them
                          # when we read it into memory.
    }
}
Let me explain how the regular expression works. The !~ operator tests for a string that DOES NOT match the regular expression. We want to print all lines that DON'T hold our email address, so in this case only the lines that don't match make it inside the if statement. The regular expression itself is pretty simple - /^\Q$email\E$/o.
Step by step - the first ^ means that the email address is anchored to the front of the line in the file. This is necessary so that the address cannot match in the middle of a longer one, and it also speeds up the regular expression somewhat, since if there is no match at the front of the line the test fails immediately and we move on to the next line. The \Q quotes metacharacters in the rest of the regex until the \E is found. This is done to prevent special characters like . from being misconstrued as part of the regex. The $email part is just the email address that you parsed out of the form data in the CGI script. Perl is smart enough to do variable substitution inside of regular expressions so you can do this without problems. The $ after the \E anchors the address to the end of the line, so removing jason@irt.org does not also remove a longer address that merely starts with the same text. The final o (the letter o, not the number zero) tells Perl to compile the regular expression only once instead of re-evaluating the variable each time through the loop. That is fine here because $email does not change while the script runs, but it also means the pattern keeps the first address it saw for the life of the program. If you're sure the variable isn't going to change, add the o; otherwise leave it off.
All that remains is to close the file. Doing so automatically saves the file.
close (FILE) or die "cannot close email file: $!";
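For reference, the whole unsubscribe sequence could be wrapped up roughly as follows. This is only a sketch: the name remove_address is invented here, the file is opened with "+<" so that it can be both read and rewritten through one filehandle, and each stored line is compared against the whole address. The /o flag is left out so the function can safely be called with different addresses in the same program.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);    # provides LOCK_EX

# remove_address() is a hypothetical name used only for this sketch.
# It returns the number of lines it removed.
sub remove_address {
    my ($file, $email) = @_;

    # "+<" opens an existing file for reading and writing
    open(my $fh, '+<', $file) or die "cannot open file for updating: $!";
    flock($fh, LOCK_EX)       or die "cannot lock file exclusively: $!";

    my @lines = <$fh>;                                  # read it all in
    my @keep  = grep { $_ !~ /^\Q$email\E$/ } @lines;   # drop matching lines

    seek($fh, 0, 0)  or die "cannot seek to start of file: $!";
    truncate($fh, 0) or die "cannot truncate file: $!";
    print $fh @keep;                                    # write the rest back

    close($fh) or die "cannot close file: $!";
    return @lines - @keep;
}
```

The return value makes it easy to tell the user whether the address was actually on the list.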
At this point, you might want to print something back to the user telling them that they were unsubscribed to the list. Features to add? Well, maybe you want to check for duplicate email addresses. This code will unsubscribe the user even if his or her email address is entered more than once (the regular expression will catch all duplicates, remember). You might want to prevent people from subscribing more than once to begin with, though. The same code above could easily be adapted to do that, so I'm not going to mention how to do it right now since this article is long enough as it is. For the future, maybe?
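For readers who want a starting point, a duplicate check might look something like this. The function name is_subscribed is hypothetical, and the sketch assumes one address per line and an exact, case-sensitive comparison.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);    # provides LOCK_SH

# is_subscribed() is a hypothetical helper. It returns 1 if the address
# already has a line of its own in the file and 0 otherwise.
sub is_subscribed {
    my ($file, $email) = @_;

    # a missing file simply means nobody has subscribed yet
    open(my $fh, '<', $file) or return 0;
    flock($fh, LOCK_SH) or die "cannot lock file for reading: $!";

    my $found = 0;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line eq $email) {
            $found = 1;
            last;
        }
    }
    close($fh);
    return $found;
}
```

The subscribe script would call this before appending and skip the write when it returns 1. Because the check and the append happen under separate locks, two simultaneous requests could still slip a duplicate through; for a small list that is usually tolerable.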
All these people are now expecting email from you, however. My first bit of advice is to finally sit down and get that newsletter written. You're going to need it soon. Once you have it written, save it off in a separate text file on your server somewhere.
Essentially, what you would like to do now is have a perl script that will open your list of email addresses and send a copy of your newsletter to every single one of them. Where do we start, then? Well, we already know how to open a file on the server using the open() command. The first thing we have to do is open our list of email addresses.
open(EMAIL, "/home/http/htdocs/email_file") or die "cannot open email: $!";
The above line will open your file containing all your email addresses in read-only mode. We are using read-only mode since we won't be modifying the contents of the file, just reading them. The next thing to do is use the flock() command to ensure that nothing happens to the file while you are working with it.
flock(EMAIL, 1) or die "cannot lock email for shared access: $!"; # 1 is a shared lock, which is all a reader needs
You also need your newsletter at this point. The Perl script must have it open in order to read it and insert it into each email that you send. To open it, use the same procedure as above.
open(MESSAGE, "/home/http/htdocs/msg") or die "cannot open message: $!";
flock(MESSAGE, 1) or die "cannot lock message for shared access: $!"; # a shared lock again, since we only read
Now, all that remains is to read your message into an array and send email to everyone on the list. As we saw in past articles, Perl will let you read a file into an array using the <> operator. The <> operator returns a single line at a time (for the most part), until it encounters the end of the file. The following line of code will read the message file into an array:
my @msgfile = <MESSAGE>; # read it into @msgfile
Notice that we pass the <> operator the filehandle we wish to use. Now that our message is in memory, we no longer need our message file open and locked by the perl script. You should always free resources when you no longer need them. This is a safe programming technique that is especially important when doing things that are very memory intensive (like reading files, for example). The following line of code will close your filehandle to the message file and free the resource.
close (MESSAGE) or die "cannot close message file: $!";
We will also have to read our email file into memory so we can cycle through each email address in the list. We do this in the same way:
my @email_list = <EMAIL>;
and then close the filehandle since we no longer need it:
close (EMAIL) or die "cannot close email list: $!";
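Since both files go through the same open, lock, read and close steps, they could share one small routine. The sketch below is an assumption of how that might look; read_lines is a made-up name, and it takes a shared lock (LOCK_SH from the Fcntl module), which is enough when the file is only being read.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);    # provides LOCK_SH

# read_lines() is a hypothetical helper. It opens a file read-only, takes
# a shared lock, and returns the lines with their newlines still attached.
sub read_lines {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "cannot open $file: $!";
    flock($fh, LOCK_SH)      or die "cannot lock $file: $!";
    my @lines = <$fh>;       # in list context <> returns every line
    close($fh)               or die "cannot close $file: $!";
    return @lines;
}

# Possible use with the two files from this article:
#   my @msgfile    = read_lines('/home/http/htdocs/msg');
#   my @email_list = read_lines('/home/http/htdocs/email_file');
```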
All we need to do now is cycle through each email address and send the message to them. We do this using a foreach loop. As mentioned in previous articles, the foreach loop cycles through each item in a list and performs the same section of code each time. We could cycle through our email list like this:
foreach my $address (@email_list) {
    # code goes in here.
}
But, what code do we use? How can perl send email? Does it have a built in email program? Well, no, not exactly. But Unix usually has one. Most Unix flavours come with a program that is used to send and receive email. You probably have an account that is serviced by a Unix machine using this program. The program is called sendmail, and is one of the staples of Unix. And yes, Perl has access to it.
The way you call sendmail from Perl is by using the open() command again, but this time the argument is a bit different. You are going to use what is known as a pipeline in order to pipe your perl output to the sendmail program. It's done like this:
open (SENDMAIL, "|/usr/sbin/sendmail -t") or die "cannot open sendmail: $!";
print SENDMAIL "To: $email_address_goes_here\n";
As you can see, all you need to do is replace $email_address_goes_here with the email address you want to send mail to. Quite simple. I should also mention that the location for sendmail on your system might be different. You can find it by typing:
which sendmail
at the Unix prompt.
This is much safer than passing the email address as a direct parameter to sendmail. Doing so could introduce security bugs and is not recommended.
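Even with -t, the address ends up inside a header, so a value containing a newline could add headers of its own, and one containing a comma would name several recipients. A cautious script might refuse such values before mailing. The check below is a deliberately loose sketch, not a full address validator, and looks_safe is a hypothetical name.

```perl
use strict;
use warnings;

# looks_safe() is a hypothetical helper. It accepts an address only if it
# has exactly one @ with something on both sides, and no whitespace
# (which includes newlines) or commas anywhere.
sub looks_safe {
    my ($address) = @_;
    return 0 unless defined $address;
    return $address =~ /\A[^\s,\@]+\@[^\s,\@]+\z/ ? 1 : 0;
}
```

Inside the mailing loop this could be used as: next unless looks_safe($address);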
So, our code now looks like this:
foreach my $address (@email_list) {
    chomp($address); # chomp removes the newline from our address,
                     # since one was present in our
                     # email address file.
    open (SENDMAIL, "|/usr/sbin/sendmail -t") or die "cannot open sendmail: $!";
    print SENDMAIL "To: $address\n";
    close (SENDMAIL); # closing sendmail sends our email message.
}
There is still a bit more to do. We have to tell sendmail that we want to print our message in each email, and we probably also want to fill in some default information about ourselves. To accomplish this, we have to add a few lines of code.
foreach my $address (@email_list) {
    chomp($address);
    open (SENDMAIL, "|/usr/sbin/sendmail -t") or die "cannot open sendmail: $!";
    print SENDMAIL "To: $address\n";
    print SENDMAIL "From: jason\@irt.org\n";         # we have to escape the @ so perl
                                                     # doesn't think it's a variable.
    print SENDMAIL "Subject: Weekly Newsletter\n\n"; # use a double newline to tell
                                                     # sendmail that the
                                                     # headers are done
    print SENDMAIL @msgfile;                         # include our newsletter
    close (SENDMAIL);                                # closing sendmail sends our email message.
}
Really, that's all it takes. A simple Perl application (this isn't a CGI script) will take the list of email addresses and send the newsletter to every one of them. This sort of thing really automates life for webmasters, especially if their site is a busy one.
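The same loop can be arranged as two small routines, one that builds the text of a message and one that pipes it to the mail program. This is a sketch under a few assumptions: the names build_message and send_newsletter are invented, and the mailer command is passed in as a parameter so that whatever path which sendmail reports can be used.

```perl
use strict;
use warnings;

# build_message() and send_newsletter() are hypothetical names used only
# for this sketch.
sub build_message {
    my ($to, $from, $subject, @body) = @_;
    return "To: $to\n"
         . "From: $from\n"
         . "Subject: $subject\n\n"    # the blank line ends the headers
         . join('', @body);
}

sub send_newsletter {
    my ($mailer, $from, $subject, $addresses, $body) = @_;
    my $sent = 0;
    foreach my $address (@$addresses) {
        chomp(my $to = $address);     # copy the address, minus its newline
        next unless length $to;       # skip blank lines in the list
        open(my $pipe, "| $mailer") or die "cannot open $mailer: $!";
        print $pipe build_message($to, $from, $subject, @$body);
        # closing the pipe lets the mail program finish and send
        close($pipe) or warn "mailer reported a problem for $to (status $?)\n";
        $sent++;
    }
    return $sent;                     # number of messages handed over
}

# Example call, using the arrays read earlier in the article:
#   send_newsletter('/usr/sbin/sendmail -t', 'jason@irt.org',
#                   'Weekly Newsletter', \@email_list, \@msgfile);
```

Keeping the message text in its own routine makes it possible to print a sample to the screen and check it before anything is mailed.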
I'm just going to say this once - Spam is wrong. It's annoying, costs money, and achieves no real goal other than getting people frustrated. Chain letters are illegal in a lot of places. It's really your decision, but I would recommend that you think twice before doing it. The internet will be a better place for it.
There isn't a full example for this article, because I think that the code mentioned in here, coupled with the code examples from previous articles, should be more than enough to get a mailing list up and running fairly quickly. If you have any questions, let me know.