How is plain text modified when set through innerHTML?

When setting innerHTML = '\r\n', it seems like browsers may end up writing '\n'.
This introduces a gap between the actual plain text content of the element and what I have been keeping track of.

Is this a rather isolated problem or are there many more potential changes I should be aware of?
How can I ensure that the content of the text nodes matches exactly what I'm trying to write?

I guess I could avoid innerHTML altogether, build the elements and text nodes myself, and insert them, but that's much less convenient.

Solution

When you read a string from innerHTML, you don't get back the string you wrote. The browser parses what you assign into DOM nodes, and reading innerHTML creates a new string completely from scratch by converting the DOM structure of the element into HTML that will (mostly) recreate it. That means lots of things happen:

Newlines are normalized
Character entities are normalized
Quotes are normalized
Tags are normalized
Tags are corrected if the text you supplied defined an invalid HTML structure

...and so on. You can't expect a round-trip through the DOM to result in exactly the same string.
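
For the newline case specifically, here is a minimal sketch of how you might bring the string you're tracking in line with what ends up in the DOM. The helper name normalizeNewlinesLikeParser is made up for illustration (it is not a browser API), and it assumes the parser applies the usual newline normalization: CRLF becomes LF, then any lone CR becomes LF. It covers none of the other changes listed above.

```javascript
// Hypothetical helper: mimics only the parser's newline normalization
// (CRLF -> LF, then any remaining lone CR -> LF). It does not model
// entity decoding, tag correction, or any other parser behavior.
function normalizeNewlinesLikeParser(text) {
  return text.replace(/\r\n?/g, "\n");
}

const written = "line one\r\nline two\rline three\n";
const tracked = normalizeNewlinesLikeParser(written);

// JSON.stringify makes the remaining line breaks visible in the console.
console.log(JSON.stringify(tracked));
```

If you go this route, track the normalized string rather than the one you originally wrote, and still verify against the real element in the browsers you care about.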

If you're dealing with pure text content, you can use textContent instead:

const x = document.getElementById("x");
const str = "CRLF: \r\n";
x.textContent = str;
console.log(x.textContent === str);

<div id="x"></div>

I can't 100% vouch for there being no newline normalization (or, come to that, Unicode normalization; you might run the string through normalize first), although a quick test with a Chromium-based browser, Firefox, and iOS Safari suggested there wasn't any. Either way, most of the issues with innerHTML don't occur with textContent.
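
If you do want to guard against Unicode normalization differences when comparing, a rough sketch using String.prototype.normalize might look like this. The helper name sameAfterNfc is hypothetical, and choosing NFC rather than another form is an assumption; pick whichever form suits your data.

```javascript
// "é" written two ways: precomposed (U+00E9) and as "e" followed by
// a combining acute accent (U+0301). They render the same but differ as strings.
const precomposed = "\u00e9";
const decomposed = "e\u0301";

console.log(precomposed === decomposed); // false

// Hypothetical helper: compares two strings after normalizing both to NFC.
function sameAfterNfc(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(sameAfterNfc(precomposed, decomposed)); // true
```

Note that normalize only touches Unicode composition; it leaves "\r\n" as it is, so it has no effect on newline differences.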