I would like to let my users enter emoji characters in an input field. I assumed that in 2019 this would be as trivial as setting the meta charset of the website to UTF-8. However, when tested in Chrome or Firefox, the example below counts supplementary characters (4 bytes long in UTF-8) differently from other characters.
In the first input I can only enter 2 more characters after the poop. In the second input I can still enter 3 more characters after ‰, which is 3 bytes long.
What is causing this inconsistent behaviour? Is there another HTML meta setting for 4 byte characters? It worked fine in Edge 17. Even trash IE 11 counts the length correctly.
<input type="text" value="💩" maxlength="4" />
<input type="text" value="‰" maxlength="4" />
My Test cases: http://jsfiddle.net/L726ryea/7/
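The HTML5 spec says that maxlength applies to the JavaScript string length, which is the number of UTF-16 code units. Code points beyond 0xFFFF, such as most emoji, therefore count as two code units. This explains the behavior you're seeing: 💩 (U+1F4A9) already uses 2 of the 4 allowed code units, leaving room for 2 more, while ‰ (U+2030) uses only 1, leaving room for 3.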
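To see the difference yourself, here is a minimal sketch that compares the three ways of measuring the two values from the question. The helper names codePointLength and utf8ByteLength are made up for illustration, and the snippet assumes an environment where TextEncoder is available (current browsers and recent Node.js).

```javascript
// The two values from the question.
const poop = "💩";     // U+1F4A9, beyond 0xFFFF
const perMille = "‰";  // U+2030, within 0xFFFF

// Hypothetical helper: counts code points instead of UTF-16 code units.
// Iterating a string (which Array.from does) walks it code point by code point.
function codePointLength(s) {
  return Array.from(s).length;
}

// Hypothetical helper: counts the bytes of the UTF-8 encoding.
function utf8ByteLength(s) {
  return new TextEncoder().encode(s).length;
}

for (const s of [poop, perMille]) {
  console.log(s, {
    utf16CodeUnits: s.length,        // what maxlength is measured in
    codePoints: codePointLength(s),
    utf8Bytes: utf8ByteLength(s),
    remainingWithMaxlength4: 4 - s.length,
  });
}
```

The remainingWithMaxlength4 value is 2 for 💩 and 3 for ‰, which matches what the inputs allow. If you want a limit expressed in code points, you would have to check something like codePointLength(input.value) in your own script instead of relying on maxlength alone; that is a suggestion, not something the spec does for you.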