Search code examples
phpmysqlpdoutf8mb4

PHP/mysql site using utf8mb4 won't retrieve emojis correctly from database, despite utf8mb4 being specified everywhere that I can find to put it


I converted my mysql 5.7 database to utf8mb4 today.

As a test, I've put a poo emoji (💩) in a field, using Navicat.

It displays fine in Navicat, even if I quit and re-open the program.

The problem comes when I retrieve it using PDO in my script and echo it out to the browser; I get ������ (6 diagonal black question markes).

If I copy the character directly into my script and echo that, it works.

So it's not a problem with mysql.

It's not a problem with the font.

It's not a problem with my browser.

It's not a problem with php echoing.

So... it's a problem with PDO?

This is my PDO code:

$PDO = new PDO('mysql:host='.DB_SERVER.';dbname='.DB_NAME.';charset=utf8mb4', DB_USER, DB_PASSWORD, [
    PDO::ATTR_ERRMODE            => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_DEFAULT_FETCH_MODE => PDO::FETCH_ASSOC,
    PDO::ATTR_EMULATE_PREPARES   => false
]);
$PDO->query("SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'");

$rs = $PDO->query("SELECT name FROM users WHERE id = 1000");  // name has been set to 💩

while ($a = $rs->fetch()) {
    print_r($a);
}

If I check the output of SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; in php in the browser every output is correctly utf8mb4/utf8mb4_unicode_ci (apart from character_set_system which is simply utf8, but I think that's normal?)

What else could it be?


Solution

  • The problem was an outdated version of Navicat. Upgrading from 8 to 11 has fixed my issues. I realised this was probably the cause when I discovered that emojis INSERTed via php were correctly stored and retrieved in PHP, but wouldn't show correctly in Navicat.

    Similarly the opposite was true; emojis set in Navicat would correctly store and retrieve in Navicat, but not in PHP.

    Looking at the encoding options for navicat connections, it had a checkbox for "use mysql encoding", but if left unchecked the dropdown of possible options included only utf8, not utf8mb4. I guess navicat 8 pre-dates that being in common usage.

    A quick upgrade, and everything works perfectly.