The Ruby gem rmail has methods to parse a mailbox file on local disk. Unfortunately, this gem is broken in Ruby 2.0.0. It might not get fixed, because folks are migrating to the gem mail.
The mail gem has the method Mail.read('filename.txt'), but that parses only the first message in a mailbox.
That gem, and the built-in Net::IMAP, have flooded the net with tutorials on accessing mailboxes through IMAP.
So, is there still a way to parse a plain old file, without IMAP? As the lone rubyist in my group I'd rather not embarrass myself by resorting to http://docs.python.org/2/library/mailbox.html.
Or, worse yet, PHP's imap_open('/var/mail/www-data', ...) -- if only Net::IMAP.new accepted filenames like that.
The good news is that the mbox format is dead simple, though its simplicity is why it was eventually replaced. Parsing a large mailbox file to extract a single message is not especially efficient, because the whole file has to be scanned line by line to find where each message begins.
If you can split apart the mailbox file into separate strings, you can pass these strings to the Mail library for parsing.
An example starting point:
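require 'mail'

# Parse one raw message string; replace the placeholder with your own logic.
def parse_message(message)
  mail = Mail.new(message)
  # do_other_stuff!(mail)  # placeholder for whatever you do with each message
  mail
end

message = nil

while (line = STDIN.gets)
  if line.start_with?('From ')
    # A "From " separator line begins a new message; flush the previous one.
    parse_message(message) if message
    message = +''
  elsif message
    # Undo mbox quoting: a body line ">From " is an escaped "From ".
    message << line.sub(/\A>From /, 'From ')
  end
end

# The last message has no separator after it, so flush it here.
parse_message(message) if message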
The key is that each message starts with a line beginning with "From ", where the trailing space matters: it distinguishes the separator line from the From: header inside the message. Any body line that starts with ">From " is an escaped line and is to be treated as actually starting with "From ". It's things like this that make this encoding method really inadequate, but if Maildir isn't an option, this is what you've got to do.
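To make the separator and unquoting rules concrete, here is a minimal sketch that uses only the standard library. The helper name each_mbox_message and the sample addresses are made up for illustration, and it assumes the simple quoting described above (a single ">" in front of "From "). It returns raw strings rather than parsed messages, so it runs without the mail gem installed.

```ruby
require 'stringio'

# Hypothetical helper: yields each raw message (without its "From " separator
# line) from an mbox-style IO. Returns an Enumerator when no block is given.
def each_mbox_message(io)
  return enum_for(:each_mbox_message, io) unless block_given?

  message = nil
  io.each_line do |line|
    if line.start_with?('From ')
      # Separator line: emit the message collected so far, start a new one.
      yield message if message
      message = +''
    elsif message
      # Undo the ">From " escaping applied to body lines.
      message << line.sub(/\A>From /, 'From ')
    end
  end
  # The final message is not followed by a separator.
  yield message if message
end

# Two made-up messages; the first body contains an escaped "From " line.
sample = <<~MBOX
  From alice@example.com Mon Jan  1 00:00:00 2024
  From: alice@example.com
  Subject: first

  >From here on, this is body text.
  From bob@example.com Mon Jan  1 00:01:00 2024
  From: bob@example.com
  Subject: second

  Hello.
MBOX

messages = each_mbox_message(StringIO.new(sample)).to_a
puts messages.length
puts messages.first
```

With a real file you would pass an open File instead of the StringIO and hand each string to Mail.new. Because the helper returns an Enumerator, calling .first on it stops reading as soon as the first message is complete, which helps a little with the efficiency problem on large mailboxes.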