Tags: email, smtp, imap, rfc, linefeed

Why does an email subject contain linefeed or carriage return characters?


I'm writing code to check a mailbox and forward unseen mails to another user.
But sometimes it fails with an error:

ValueError: Header values may not contain linefeed or carriage return characters

I checked the raw fetched data and found that the 'Subject' value contains \r\n.
Not all mails contain it, but some do.
The subject looks normal in the mailbox, and I have no idea why some contain such characters.
Does it have to do with the length of the subject?
How can I deal with these situations?
Thanks :)


Solution

  • Email messages have a maximum line length. That's historical, and the rule isn't always upheld. To stay within it, a long header field can be folded across several lines: inside a header field, a CR LF followed by a space or htab character is to be treated as ordinary whitespace, so it reads the same as a plain space. Here is a really long subject, folded in that way:

    Subject: Pretend this is about 80-90
      characters long
    

    The simplest way to deal with it is to treat any sequence of whitespace characters (CR, LF, space, htab) as a single space.

    Read the source of any email message and you'll see this wrapping in most of them. The Received field is almost always wrapped, for instance, and quite often To if there are many addressees, or Content-Type/Content-Disposition for attachments.
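The whitespace-collapsing approach can be sketched in Python, which is an assumption based on the wording of the ValueError in the question. The helper name `unfold`, the sample message and the addresses are made up for illustration; this is a minimal sketch, not a complete forwarding script.

```python
import re
from email import message_from_bytes
from email.message import EmailMessage


def unfold(value: str) -> str:
    """Collapse every run of whitespace (CR, LF, space, tab) into one space."""
    return re.sub(r"\s+", " ", value).strip()


# Hypothetical raw message with a folded Subject, as it might come back
# from an IMAP fetch.
raw = (
    b"From: alice@example.com\r\n"
    b"Subject: Pretend this is about 80-90\r\n"
    b"  characters long\r\n"
    b"\r\n"
    b"Body\r\n"
)

# The default (compat32) parser keeps the line break inside the header value.
original = message_from_bytes(raw)
folded_subject = original["Subject"]

# Clean the value before copying it into the message to be forwarded.
clean_subject = unfold(folded_subject)

forward = EmailMessage()
forward["Subject"] = clean_subject
print(forward["Subject"])
```

Assigning `folded_subject` directly to the new message is what triggers the error from the question; assigning the cleaned value avoids it. As an alternative, parsing with `policy=email.policy.default` should, as far as I know, hand back header values with the line breaks already removed, so it may be worth trying before writing a helper of your own.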