html facebook-like twig

Website Outputting HTML Number Instead of Asian Characters


My website is rendered with Twig. I have a meta tag that outputs HTML numbers (numeric character references) instead of the actual Chinese characters. How can I get it to render the actual Chinese characters?

I've noticed that if I do the same thing in a <p>, the page source also contains the HTML numbers, but the browser displays them as the actual Chinese characters.

Note that the issue I have here is with Facebook. When Facebook scrapes my page, they read the og:description value which is presented as HTML entities.

code: <meta name="og:description" content="{{ product.description }}"/>

actual output: <meta name="og:description" content="&#28858;&#12298;&#27054;&#35709;&#21235;&#31456;&#65306;&#37941;&#34880;&#24717;"/>

expected output: <meta name="og:description" content="為《榮譽"/>


Solution

  • First, make sure the page is UTF-8 encoded and declared as UTF-8. See W3C material on character encodings if needed.

    Then, convert character references like &#28858; to the corresponding characters, in UTF-8 encoding, when generating the HTML document from the database content. In PHP, you can use mb_decode_numericentity for this.
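
    The conversion step could look roughly like the sketch below. It assumes the mbstring extension is enabled and that the description comes out of the database as a string of decimal references. The helper name decodeNumericEntities and the $stored sample are hypothetical, used here only for illustration.

```php
<?php
// Hypothetical helper; not part of Twig or of the asker's code base.
function decodeNumericEntities(string $text): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // Starting at 0x80 leaves ASCII references such as &#60; or &#34; untouched,
    // so markup-significant characters stay escaped inside the attribute value.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_decode_numericentity($text, $map, 'UTF-8');
}

// Sample value standing in for the database field (first four references from the question).
$stored = '&#28858;&#12298;&#27054;&#35709;';

echo decodeNumericEntities($stored), PHP_EOL; // prints: 為《榮譽
```

    The decoded string can then be passed to the template in place of the raw database value, so that the og:description attribute contains real UTF-8 characters.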