The Perl Email Project

Given that (a) Perl is optimized for text processing and Unix-y administrative tasks; and (b) the mbox email storage format is both plain text and deeply rooted in Unix culture — you’d think Perl would be a terrific scripting language for mbox parsing.

But I’ve always rather disliked the entire Mail:: hierarchy of CPAN modules, several of which provide ways to parse mbox files (along with other mail storage formats). They reek of over-engineering — complicated APIs and slow performance. One of Larry Wall’s guiding precepts for Perl is, “Easy things should be easy, and hard things should be possible.” The modules in the Mail:: hierarchy fail the “easy things should be easy” test.

In response, Simon Cozens and several collaborators have spearheaded the Perl Email Project, the product of which is the relatively new Email:: hierarchy of CPAN modules. These modules are simple, fast, and easy to use. I replaced Mail::MboxParser with Email::Folder in a script I wrote to process mbox files full of T-shirt orders, and it ran over 10 times faster. Even better, I found the syntax more intuitive — more, well, Perlish.

For example, in my original script using Mail::MboxParser, getting the body of a message object $msg as a string took two steps:

my $body = $msg->body($msg->find_body);
my $text = $body->as_string;

Whereas using Email::Folder (and Email::Simple), I can just write:

my $text = $msg->body;
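
To put that one-liner in context, here is a minimal sketch of a complete mbox-reading loop built on Email::Folder. It is an illustration only, not the actual T-shirt order script: the helper name read_mbox and the hash keys subject and body are hypothetical, and it assumes the file is a standard mbox that Email::Folder can open by path. Each message comes back as an Email::Simple object, so header and body are plain method calls.

```perl
use strict;
use warnings;
use Email::Folder;    # from CPAN; hands back Email::Simple objects

# Hypothetical helper: collect the subject and body of every message
# in the mbox file at $path.
sub read_mbox {
    my ($path) = @_;

    my $folder = Email::Folder->new($path);
    my @messages;

    # next_message returns one Email::Simple object at a time,
    # and a false value once the folder is exhausted.
    while ( my $msg = $folder->next_message ) {
        push @messages, {
            subject => scalar $msg->header('Subject'),
            body    => $msg->body,
        };
    }

    return @messages;
}

# Print the subject of each message in any mbox files named on the command line.
for my $file (@ARGV) {
    for my $m ( read_mbox($file) ) {
        my $subject = defined $m->{subject} ? $m->{subject} : '(no subject)';
        print "$subject\n";
    }
}
```

Reading one message at a time with next_message keeps memory use flat on large mbox files; for small files, $folder->messages returns the whole list in one call.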

A few weeks ago Cozens wrote an article for Perl.com, “The Evolution of Perl Email Handling”, wherein he makes a compelling case for the new Email:: modules while providing a good introduction to their usage. Highly recommended for anyone who uses Perl to read email files.