I'm attempting to convert a bit of HTML to a PDF document with IronPDF EAP 2021.6.3135. After creating a new ChromePdfRenderer, I call RenderHtmlAsPdfAsync on it, passing the HTML string as the only argument. The HTML is a single <div> with several nested <div>s, one of which contains CJK text. IronPDF appears to interpret that text as either ASCII or UTF-8; in any case, it renders it as nonsense. This works properly, without the workaround mentioned below, with the current release of IronPDF (2021.3.1).
Inserting a byte-order mark (\uFEFF) at the beginning of the string fixes the problem, but I shouldn't need to do that. Is there a new setting/option in the EAP branch's API that I've overlooked? Or is this a known issue that will get addressed before release?
Chrome's encoding autodetection fails with very long HTML strings.
It is recommended to include:
<meta charset="utf-16"/>
at the beginning of any HTML file that contains UTF-16 characters. (This is a reasonable request, because it is ultimately difficult to determine the intended encoding automatically.)
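As a rough illustration of that recommendation, here is a minimal sketch that prepends the charset declaration to an HTML string before rendering it. ChromePdfRenderer and RenderHtmlAsPdfAsync are the calls named in the question; the EnsureCharsetMeta helper, the sample markup, and the output file name are hypothetical and not part of IronPDF. It assumes the declaration is honored when it is simply placed at the start of the string.

```csharp
using System;
using System.Threading.Tasks;
using IronPdf;

public static class HtmlCharsetHelper
{
    // Hypothetical helper (not an IronPDF API): prepends the charset
    // declaration unless the markup already contains one.
    public static string EnsureCharsetMeta(string html)
    {
        if (html.IndexOf("<meta charset", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            return html;
        }
        return "<meta charset=\"utf-16\"/>" + html;
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Sample markup standing in for the real HTML string (assumption).
        string html = "<div><div>Latin text</div><div>日本語のテキスト</div></div>";

        var renderer = new ChromePdfRenderer();

        // Pass the HTML with the charset declaration at the very beginning.
        var pdf = await renderer.RenderHtmlAsPdfAsync(
            HtmlCharsetHelper.EnsureCharsetMeta(html));

        // Hypothetical output path.
        pdf.SaveAs("output.pdf");
    }
}
```

The check for an existing declaration is a plain substring search, which is enough for a sketch but would miss other forms such as an http-equiv Content-Type meta tag.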
Iron Software is reviewing the possibility of having IronPDF default to UTF-16 encoding automatically when no other encoding is specified, to help alleviate these kinds of issues.