
Displaying unicode symbols in HTML


I simply want to display the tick (✔) and cross (✘) symbols in an HTML page, but they show up as either a box or goop like âœ”, so it is obviously something to do with the encoding.

I have set the meta tag to show utf-8 but obviously I'm missing something.

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Edit/Solution: Following the comments, I used Firebug and found that the headers sent by my page were in fact "Content-Type: text/html", with no UTF-8 charset. Notepad++ showed that my file was encoded as "UTF-8 without BOM". After changing this to plain UTF-8, the symbols now show correctly... but Firebug still seems to report the same Content-Type.


Solution

  • You should ensure the HTTP server headers are correct.

    In particular, the header:

    Content-Type: text/html; charset=utf-8
    

    should be present.

    If the HTTP header declares a charset, browsers use it and ignore the charset given in the meta tag.
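As a rough way to see what a given Content-Type value actually declares, the sketch below parses the header with Python's standard `email.message.Message`. The helper name `declared_charset` is made up for illustration; it only parses a header string you pass in and does not fetch anything from a server.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns None
    # when the header carries no charset parameter.
    return msg.get_content_charset()


# The header reported in the question declares no charset at all.
print(declared_charset("text/html"))
# The header recommended above does.
print(declared_charset("text/html; charset=utf-8"))
```

A result of `None` for the header your server sends would mean the browser has to fall back on other hints, such as the meta tag.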

    Also ensure that your file is actually encoded as UTF-8 before serving it. Check or try the following:

    • Ensure your editor saves it as UTF-8.
    • Ensure your FTP client or any other file transfer program does not alter the file.
    • Try HTML numeric character references, such as &#10004; for ✔ and &#10008; for ✘.
    • To be really sure, hexdump the file and look at the character: for ✔, the bytes should be E2 9C 94.
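The last two checks can be sketched in Python. This is a minimal illustration rather than a full tool: the helper names `utf8_hex`, `numeric_reference` and `file_contains` are hypothetical, and `file_contains` assumes you pass it the path of the HTML file you are serving.

```python
TICK = "\u2714"   # ✔
CROSS = "\u2718"  # ✘


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated upper-case hex."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


def numeric_reference(char):
    """Return the decimal HTML numeric character reference for one character."""
    return f"&#{ord(char)};"


def file_contains(path, text):
    """Report whether the file at path contains text encoded as UTF-8 bytes."""
    with open(path, "rb") as f:
        return text.encode("utf-8") in f.read()


print(utf8_hex(TICK))            # E2 9C 94
print(numeric_reference(TICK))   # &#10004;
print(numeric_reference(CROSS))  # &#10008;
```

If `file_contains` returns `False` for a file that is supposed to hold a ✔, the file was probably saved or transferred in a different encoding.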

    Note: If you use a Unicode character for which your system can't find a glyph (no installed font contains that character), your browser should display a question mark or some block-like symbol. But if you see several Latin characters in place of one symbol, as you do, that indicates an encoding problem.
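The "several characters instead of one" symptom can be reproduced with a short sketch. It assumes the page is being decoded as Windows-1252 (`cp1252`), which is a common fallback but not necessarily the one your browser picks.

```python
tick_bytes = "\u2714".encode("utf-8")   # the three bytes E2 9C 94

# Decoding those bytes with a single-byte encoding yields one character
# per byte, which matches the kind of "goop" described in the question.
garbled = tick_bytes.decode("cp1252")
print(garbled)        # âœ”
print(len(garbled))   # 3

# Decoding with the correct encoding yields the single intended character.
print(tick_bytes.decode("utf-8"))   # ✔
```

A missing glyph, by contrast, would still be a single character; only its rendering would fail.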