Checking encryption status of email.

I've been working with gpg-mailgate recently and have it working almost how I want it.

My last little hurdle is how to reliably check whether an incoming email is already PGP encrypted. Here are the options I see.

1. Add a content filter in Postfix that calls Mail::GPG.is_encrypted. If it returns yes, pass the message straight on; if not, hand it to gpg-mailgate to deal with.
2. Call a small Perl script from within gpg-mailgate, which again calls Mail::GPG.is_encrypted, and go from there. I saw a few examples of calling Perl from within Python, and I'd prefer to do it this way.
3. Test unreliably for encryption by looking for **BEGIN PGP MESSAGE**, but that is in no way a great solution.

That's all I came up with. Since there is nothing in the gnupg Python wrapper to test for this, I think I'm going to have to call something else.

What I'm worried about in the first 2 scenarios is performance. I'm trying to stay away from as much unnecessary load as possible.

I'm open to all suggestions, thanks.

Solution

A GPG or other OpenPGP message sent by email should be in RFC 2015 format (PGP/MIME, since updated by RFC 3156), which is very simple to detect reliably and efficiently.

Basically, you're just checking that the RFC822 headers have a Content-Type that looks like this:

Content-Type: multipart/encrypted; boundary=whatever; protocol="application/pgp-encrypted"

The ideal way to do this is with the email and mime modules in Python, or your favorite equivalents from CPAN in Perl, but you can probably do it faster with a well-crafted regexp. (If you choose to go this way, look for pre-tested regexps for RFC822 headers and build on those, because you will get parts of RFC822 wrong the first 20 times you do it: continuation lines, optional whitespace all over the place, variable order of components, case insensitivity, etc.)
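A minimal sketch of the header check using Python's standard email package. The function name is_pgp_mime_encrypted is made up for illustration and is not part of gpg-mailgate. It parses only the headers, so the body is never decoded, and it lets the library deal with folding, quoting and case:

```python
from email.parser import BytesHeaderParser


def is_pgp_mime_encrypted(raw_message: bytes) -> bool:
    """Return True if the top-level headers declare PGP/MIME encryption.

    Hypothetical helper: it only inspects the Content-Type header and
    does not verify that the body really is a valid OpenPGP message.
    """
    # Parse the header block only; the body is left as an opaque string.
    headers = BytesHeaderParser().parsebytes(raw_message)

    # get_content_type() returns the type lowercased, without parameters.
    if headers.get_content_type() != "multipart/encrypted":
        return False

    # get_param() copes with folded lines, quoting and parameter order.
    protocol = headers.get_param("protocol")
    return isinstance(protocol, str) and protocol.lower() == "application/pgp-encrypted"
```

Since only the headers are parsed, the cost should stay small even for large messages, though I haven't measured it against a hand-written regexp.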

But what if someone just sent a raw OpenPGP message as the text of an email? At least some MUAs will detect that and treat it the same as a proper OpenPGP message. How do they do that? Basically, the way you suggested: by scanning the body.

The first part of that is to get the body itself. If they're just using a plain-text mailer, or a fancy mailer that's not stupid, it'll just be the body of a non-multipart RFC 822 message, which is easy. But if you want to handle mailers that like to HTMLify everything, you'll need to find the text/plain part, before you can even scan. Doing that reliably or quickly isn't too bad, but both?
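Here is one way to get at the plain-text body, sketched with the standard email package and its modern policy. The helper name plain_text_body is hypothetical. This takes the "reliable" side of the trade-off: it parses the whole MIME tree, so it costs more than a header-only check.

```python
from email import message_from_bytes, policy


def plain_text_body(raw_message: bytes) -> str:
    """Return the text/plain body of a message, or "" if there is none.

    Hypothetical helper. Works for non-multipart text messages and for
    multipart messages (e.g. multipart/alternative with an HTML twin).
    """
    # policy.default makes the parser return EmailMessage objects,
    # which provide get_body() and get_content().
    msg = message_from_bytes(raw_message, policy=policy.default)

    # Ask only for a text/plain candidate; HTML-only mail yields None.
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return ""

    # get_content() undoes the transfer encoding and decodes the charset.
    return part.get_content()
```

A message that declares a charset Python doesn't know may still raise here, so real code would probably want a try/except around the call.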

Now, how do you tell if a message body is OpenPGP? RFC 4880 describes the format. In particular, look at section 6.2 about the ASCII armor format.

But briefly: Look for an armor header line, -----BEGIN PGP MESSAGE-----, then zero or more RFC-822 headers, then a blank line, then a block of data that's all base-64 characters and whitespace, then the armor tail line -----END PGP MESSAGE-----. You may want to allow arbitrary lines before and after the header and tail for safety (it doesn't make the scan any harder). Anything that matches that is very, very likely to be an OpenPGP message; anything that doesn't is unlikely to be usable with GPG.
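The scan described above could be sketched as a single regular expression. The names ARMOR_RE and looks_like_armored_pgp are made up for illustration, and the pattern is my loose reading of the armor layout, not a full RFC 4880 validator (it does not check the CRC, for instance):

```python
import re

# Loose pattern for an ASCII-armored OpenPGP message embedded in text.
ARMOR_RE = re.compile(
    r"^-----BEGIN PGP MESSAGE-----[ \t]*\r?\n"  # armor header line
    r"(?:[^\r\n:]+:[^\r\n]*\r?\n)*"             # zero or more "Key: value" headers
    r"[ \t]*\r?\n"                              # blank separator line
    r"(?:[A-Za-z0-9+/=]+[ \t]*\r?\n)+"          # base64 data, incl. "=XXXX" checksum line
    r"-----END PGP MESSAGE-----[ \t]*\r?$",     # armor tail line
    re.MULTILINE,
)


def looks_like_armored_pgp(body: str) -> bool:
    """Return True if the text contains something shaped like an armored message.

    search() rather than match() allows arbitrary lines before and after
    the armored block, as suggested above.
    """
    return ARMOR_RE.search(body) is not None
```

Each line can only match one alternative (armor headers contain a colon, base64 lines cannot), so the pattern should not backtrack badly on large bodies.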

This still won't handle every possible case that people might use. If someone sends an OpenPGP multi-part message split across two emails, drags and drops an OpenPGP message in a way that makes their mailer turn it into an attachment instead of the body, pastes a binary OpenPGP message instead of an ASCII-armored one, …

But I suspect handling just proper OpenPGP MIME, and raw OpenPGP ASCII armor in the plaintext body, should be enough.