How do I convert data like that
\u0441\u043e\u0432\u0440\u0435\u043c\u0435\u043d
back to normal UTF8 inside MySQL? Is it even possible?
You can convert between charsets in mysql by reading the data as binary and convert it to a target charset, like this (if you have a table called example
and a column called data
:
UPDATE `example` SET data=CONVERT(CONVERT(`data` USING binary) USING utf8);
The JSON \uXXXX entities are in hexadecimal UTF-16 encoding, so if you have a mysql version with utf-16 support, you can convert them to utf-8. The function below shows you how to do it. First, you unhex() those values and then convert from UTF-16 to UTF-8:
DELIMITER @@
CREATE FUNCTION Unjson (instring TEXT CHARACTER SET utf8)
RETURNS TEXT CHARACTER SET utf8
BEGIN
DECLARE i INT DEFAULT 0;
DECLARE c VARCHAR(1);
DECLARE utfstr TEXT CHARACTER SET utf16 DEFAULT "";
DECLARE outstring TEXT CHARACTER SET utf8 DEFAULT "";
WHILE i < CHAR_LENGTH(instring) DO
SET i = i + 1;
SET c = SUBSTRING(instring, i, 1);
IF c = "\\" THEN
SET c = SUBSTRING(instring, i + 1, 1);
IF c = "u" THEN
SET utfstr = CONCAT(utfstr, UNHEX(SUBSTRING(instring, i + 2, 4)));
SET i = i + 5;
END IF;
ELSE
IF utfstr != "" THEN
SET outstring = CONCAT(outstring, CONVERT(utfstr USING utf8));
SET utfstr = "";
END IF;
SET outstring = CONCAT(outstring, c);
END IF;
END WHILE;
IF utfstr != "" THEN
SET outstring = CONCAT(outstring, CONVERT(utfstr USING utf8));
END IF;
RETURN outstring;
END@@
DELIMITER ;
With this MySQL function in place, you can convert your tables with:
UPDATE `table_name` SET `column_name`=Unjson(`column_name`);
I'm using MySQL 5.5, but I don't think 5.0 has utf-16 support, so you might have to check your MySQL version...
(And yeah, I recommend you to make a backup before running this in your production environment...) ;)