asp.net iis utf-8 globalization .net-4.8

Why might there be a UTF-8 BOM at the end (as the last bytes) of each IIS response?


Every response that runs through the IIS managed pipeline and emits a Content-Type with a charset (per the <globalization> element) has a UTF-8 BOM appended as its last bytes. This is invalid and breaks UpdatePanel usage.

If a BOM is emitted at all, it should be the first character.

curl | xxd shows the last three bytes as EF BB BF, i.e. the UTF-8 BOM:

0002b830: 2f62 6f64 793e 0d0a 3c2f 6874 6d6c 3eef  /body>..</html>.
0002b840: bbbf                                     ..
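
The same check can be done in code rather than by reading a hex dump. This is only a minimal sketch: the class and method names are made up for illustration, and the sample bytes simply mirror the tail of the dump above ("</html>" followed by EF BB BF).

```csharp
using System;

public static class BomCheck
{
    // The three-byte UTF-8 encoding of U+FEFF, as seen in the hex dump.
    private static readonly byte[] Utf8Bom = { 0xEF, 0xBB, 0xBF };

    // Returns true when the last three bytes of the body are EF BB BF.
    public static bool EndsWithUtf8Bom(byte[] body)
    {
        if (body == null || body.Length < Utf8Bom.Length) return false;
        int offset = body.Length - Utf8Bom.Length;
        for (int i = 0; i < Utf8Bom.Length; i++)
        {
            if (body[offset + i] != Utf8Bom[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        // Hypothetical response tail: "</html>" followed by the BOM.
        byte[] tail = { 0x3C, 0x2F, 0x68, 0x74, 0x6D, 0x6C, 0x3E, 0xEF, 0xBB, 0xBF };
        Console.WriteLine(EndsWithUtf8Bom(tail));
    }
}
```

Feeding it the raw bytes of a real response (for example from HttpClient.GetByteArrayAsync) would let the check run against every affected URL in a loop.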

Notes:

  • This issue does not affect responses served by the Static File handler, which suggests the issue lies within the managed / ASP.NET pipeline.

  • This issue still occurs when calling Response.End() immediately after writing some dummy data. This should imply that the issue is not caused by any further request handler or soft-wrapper, as the call results in ThreadAbortException propagation.

  • MVC and ASP.NET WebForm pages are affected in the same way.

  • ASHX requests are not affected.

  • The issue started, or at least became very apparent, when changing to use <globalization responseEncoding="utf-8" ..>.

  • No such BOM / character appears at the end of the source files used in the requests, much less in every single affected file. (The codebase has no occurrences of \xEF\xBB\xBF other than as a leading BOM, nor any occurrences of \xFE\xFF or \xFF\xFE.)
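
For reference, the kind of source scan mentioned in the last note could look roughly like this. It is a sketch under assumptions, not the exact search that was run: the class and method names are invented, and it only looks for the UTF-8 BOM bytes at offsets past the start of the data.

```csharp
using System;
using System.Collections.Generic;

public static class StrayBomScan
{
    // Returns every offset >= start at which pattern occurs in data.
    public static List<int> FindAll(byte[] data, byte[] pattern, int start)
    {
        var hits = new List<int>();
        for (int i = Math.Max(start, 0); i + pattern.Length <= data.Length; i++)
        {
            bool match = true;
            for (int j = 0; j < pattern.Length; j++)
            {
                if (data[i + j] != pattern[j]) { match = false; break; }
            }
            if (match) hits.Add(i);
        }
        return hits;
    }

    // A BOM at offset 0 is legitimate, so the search starts at offset 1.
    public static List<int> FindStrayUtf8Boms(byte[] data)
    {
        return FindAll(data, new byte[] { 0xEF, 0xBB, 0xBF }, 1);
    }
}
```

To cover a codebase, each file's bytes from File.ReadAllBytes could be passed in while walking Directory.EnumerateFiles; calling FindAll with the patterns FE FF and FF FE from offset 0 would cover the UTF-16 BOMs as well.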

All IIS modules that do not ship with IIS or ASP.NET MVC have already been removed and the issue persists.

What may be the issue, and what are the next steps in troubleshooting?


Solution

  • The issue was entirely within the ASP.NET pipeline. Some awesome "handlers" had been added both to the MVC pipeline (via a filter registered in Global.asax) and to WebForms (via a base page); both ended up re-assigning HttpContext.Response.Filter to a special delegate or, in this case, multiple wrapped delegates.

    These "handlers" effectively wrote one stream's content, followed by the empty-except-for-BOM response from another "handler", and the effect was similar to:

     _stream.Write(someOutputBuffer, ..);
     _stream.Write(originalBuffer, ..);
    

    So if originalBuffer happened to be empty except for the BOM, the output would have the BOM appended at the "end", as opposed to an equally invalid location somewhere in the middle of the response.

    Switching to the UTF-8 response encoding triggered the behavior because each of these "handlers" opened a stream with the current response encoding, which had switched to UTF-8, and the default UTF-8 encoding emits a BOM.

    The fix was to construct the UTF-8 encoding so that it does not emit a BOM, instead of using Response.ContentEncoding (UTF-8 with BOM) directly. (The code is entirely suspect, although that's a task for another day.)

    tl;dr: janky "handlers" aren't always installed as IIS modules/handlers.
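
The failure mode and the fix can be reproduced outside of ASP.NET with a small console sketch. This is a simulation, not the actual filter code (which is not shown here): WithoutBom and Combine are hypothetical names, the two buffers stand in for the outputs of two wrapped "handlers", and Encoding.UTF8 stands in for a Response.ContentEncoding configured as UTF-8.

```csharp
using System;
using System.IO;
using System.Text;

public static class TrailingBomDemo
{
    // Hypothetical helper: keep the configured encoding, but swap UTF-8 for
    // an instance whose preamble is empty, so no BOM is ever written.
    public static Encoding WithoutBom(Encoding encoding)
    {
        return encoding.CodePage == Encoding.UTF8.CodePage
            ? new UTF8Encoding(false)
            : encoding;
    }

    // Simulates two wrapped "handlers" writing to the same output stream.
    public static byte[] Combine(string html, Encoding handlerEncoding)
    {
        // First "handler": the real page output, encoded without a preamble.
        byte[] someOutputBuffer = new UTF8Encoding(false).GetBytes(html);
        // Second "handler": no content, only the preamble of its encoding.
        byte[] originalBuffer = handlerEncoding.GetPreamble();

        using (var stream = new MemoryStream())
        {
            stream.Write(someOutputBuffer, 0, someOutputBuffer.Length);
            stream.Write(originalBuffer, 0, originalBuffer.Length);
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        byte[] broken = Combine("<html></html>", Encoding.UTF8);
        byte[] repaired = Combine("<html></html>", WithoutBom(Encoding.UTF8));

        // Last three bytes of the broken output: prints "EF-BB-BF".
        Console.WriteLine(BitConverter.ToString(broken, broken.Length - 3));
        // The repaired output is exactly three bytes shorter: prints "True".
        Console.WriteLine(repaired.Length == broken.Length - 3);
    }
}
```

In a real Response.Filter wrapper, the equivalent change would presumably be to pass something like WithoutBom(Response.ContentEncoding) to whatever StreamWriter or encoder the filter creates, rather than the response encoding itself.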