Garbled special characters with SQL rendering of XML data


I have a DHTMLX grid on a page that saves data to a DB through a PHP connector file. The grid is populated from XML that the PHP connector file generates.

Japanese words in the grid show up in Japanese but get saved as: ãƒ¼ãƒ€ãƒ¼ However, they do stay in Japanese in the grid! (somehow...) If I save something in the DB through phpMyAdmin, it shows up in the grid as: ???

I checked and everything seems right...
DB fields: UTF-8 √
HTML headers: UTF-8 √
connector.php: UTF-8 √ (checked through network tab, devtools)
Is there anywhere else I should check?

When looking at the PHP file that gives me the DB values, I get XML data that's already garbled:

 <rows><row id='00000000001'><cell><![CDATA[]]></cell><cell><![CDATA[??]]></cell><cell><![CDATA[33]]></cell><cell><![CDATA[]]></cell><cell><![CDATA[]]></cell><cell><![CDATA[?????????]]></cell>...

So maybe the problem lies before the data is received from the server. Does anyone know where I should look for the problem?


Solution

  • Were you expecting ーダー for ãƒ¼ãƒ€ãƒ¼? (Mojibake.)

    Other times, do you get question marks?

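    Those two symptoms come from different causes, but both usually involve not declaring the client bytes to be utf8. In PHP, that can be done with mysqli_set_charset($link, 'utf8'), where $link is your mysqli connection (or $mysqli->set_charset('utf8') in the object-oriented style).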

    Question marks usually also involve failing to declare the column to be utf8.

    To further diagnose, please do

    SELECT col, HEX(col) FROM tbl WHERE ...
    

    so we can see whether the text was mangled as it was inserted.
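
To make the two symptoms and the HEX check concrete, here is a small sketch in Python that reproduces them without any database. It is only an illustration under assumptions: cp1252 stands in for what MySQL calls latin1, the sample word ーダー is taken from the question, and `hex_of` is a hypothetical helper that mimics what HEX(col) would show for a utf8 column.

```python
# Illustration only: reproduces both symptoms in pure Python, no database involved.
text = "ーダー"

# The bytes a UTF-8 client actually sends.
utf8_bytes = text.encode("utf-8")

# Symptom 1 (Mojibake): the UTF-8 bytes are treated as cp1252/latin1 characters.
mojibake = utf8_bytes.decode("cp1252")

# Symptom 2 (question marks): the text is forced into a charset that cannot
# represent it, so every unrepresentable character is replaced by "?".
question_marks = text.encode("latin-1", errors="replace").decode("latin-1")


def hex_of(value: str) -> str:
    """Hypothetical helper: upper-case hex of the UTF-8 bytes, like HEX(col)."""
    return value.encode("utf-8").hex().upper()


print(mojibake)                # ãƒ¼ãƒ€ãƒ¼
print(question_marks)          # ???
print(hex_of(text))            # E383BCE38380E383BC (correctly stored: 3 bytes per character)
print(hex_of(mojibake))        # starts with C3A3C692C2BC (double-encoded)
print(hex_of(question_marks))  # 3F3F3F (the original text is lost)

# Mojibake is usually reversible, question marks are not.
print(mojibake.encode("cp1252").decode("utf-8") == text)  # True
```

If HEX(col) shows E383... the stored data is fine and only the reading side is misconfigured; C3A3... suggests the text was double-encoded on insert; 3F means the characters were already lost when the row was written.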