html, browser, web-standards, line-breaks

Do browsers send "\r\n" or "\n" or does it depend on the browser?


This question has bothered me for a million years... whenever I create a website with a textarea that allows multi-line input (such as a "Bio" for a user's profile), I always end up writing the following paranoid code:

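// C# code sample (requires: using System.Text.RegularExpressions;)
// Normalize CRLF and lone CR to LF.
bio = bio.Replace("\r\n", "\n").Replace("\r", "\n");
// Collapse runs of newlines so that at most one blank line remains.
bio = Regex.Replace(bio, @"\n{2,}", "\n\n");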

So, what do browsers send up for a <textarea name="Bio"></textarea> if it has multiple lines?


Solution

  • The HTTP and MIME specs require header lines to end with \r\n, but they aren't clear (some would argue it isn't even clear whether they are clear) about how line breaks inside the contents of a TEXTAREA should be sent. (See, for instance, this thread from an HTML working group about the issue.)

    Here's a quote from the HTTP/1.1 spec about message headers:

    The line terminator for message-header fields is the sequence CRLF. However, we recommend that applications, when parsing such headers, recognize a single LF as a line terminator and ignore the leading CR.

    I think that is a good strategy in general: be strict about what you produce but liberal in what you accept. You should assume that you will receive all sorts of line terminators. (Note that in addition to CRLF and LF, classic Mac OS, up through Mac OS 9, used CR alone, and there are still a few of those machines around. The Unicode standard (section 5.8) specifies a wide range of character sequences that should be recognized as line terminators; there's a list of them here.)
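The "liberal in what you accept" advice can be sketched as a single normalization helper. This is only an illustration, not a definitive implementation: the class name `LineEndings` and the method `Normalize` are hypothetical, and the choice to treat only CRLF, CR, LF, NEL (U+0085), LS (U+2028) and PS (U+2029) as terminators is an assumption. Unicode section 5.8 also lists VT and FF, which are left alone here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: accept many line terminators, produce only "\n".
public static class LineEndings
{
    // "\r\n" comes first in the alternation so a CRLF pair is consumed as
    // one terminator instead of being counted as two separate ones.
    // U+0085 (NEL), U+2028 (LS) and U+2029 (PS) are an assumed subset of
    // the terminators listed in Unicode section 5.8.
    private static readonly Regex AnyTerminator =
        new Regex("\r\n|[\r\n\u0085\u2028\u2029]");

    // Three or more consecutive newlines mean more than one blank line.
    private static readonly Regex ExtraBlankLines = new Regex("\n{3,}");

    public static string Normalize(string text)
    {
        // Treat a missing field as an empty string.
        if (string.IsNullOrEmpty(text)) return string.Empty;

        // Step 1: map every recognized terminator to a single LF.
        string unified = AnyTerminator.Replace(text, "\n");

        // Step 2: collapse runs of blank lines down to one blank line.
        return ExtraBlankLines.Replace(unified, "\n\n");
    }
}

public static class Program
{
    public static void Main()
    {
        string raw = "line one\r\nline two\rline three\u2028\n\n\nline four";
        string bio = LineEndings.Normalize(raw);

        // bio is now "line one\nline two\nline three\n\nline four"
        Console.WriteLine(bio.Replace("\n", "\\n"));
    }
}
```

Doing the whole job in one pass with an alternation avoids the ordering trap of chained `Replace` calls, where replacing `\r` before `\r\n` would turn each CRLF into two newlines.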