I am using Perl 5.20.2 and MySQL 5.5.57 on a Debian 8 machine. I recently discovered that MySQL's utf8 tables are limited to three-byte-characeters. As a consequence I can not store emojis. So, I tried utfmb4 tables which are supposed to address the issue. I changed the table from utf8 to utf8mb4 from inside the mysql client:
ALTER DATABASE `mydb` CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
ALTER TABLE `mydb`.`mytable` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE `mydb`.`mytable` CHANGE `object` `object` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Storing data in mytable seems to work, at least I can see the expected emoji in phpMyAdmin. However, when reading from the table I receive a 4-character result with 3 unprintable charaters. The following program is supposed to print the same emoji twice:
#!/usr/bin/perl
use 5.10.1;
use warnings;
use strict;
use DBI;
binmode(STDOUT, ':utf8');
my $object = "\x{1F600}";
my $hd_db = DBI->connect('DBI:mysql:mydb:localhost', 'user', 'password');
$hd_db->do('SET NAMES utf8mb4');
# cleanup
my $delete = $hd_db->prepare("DELETE FROM mytable");
$delete->execute;
my $insert = $hd_db->prepare("INSERT INTO mytable (object) VALUES ('" . $object . "')");
$insert->execute;
my $select = $hd_db->prepare("SELECT * FROM mytable");
$select->execute;
my $row = $select->fetchrow_hashref;
say $object;
say $row->{'object'};
Expected output:
😀
😀
Actual output:
😀
�
Seems like a bug to me. Any suggestion how to work around it?
EDIT: SELECTing the data from within the mysql client also shows the expected emoji
mysql> SET SESSION CHARACTER_SET_CLIENT = utf8mb4;
mysql> SET SESSION CHARACTER_SET_RESULTS = utf8mb4;
mysql> SELECT * FROM mytable;
+--------+
| object |
+--------+
| 😀 |
+--------+
You told MySQL to use UTF-8 for communication, but you also need to tell DBD::mysql to decode the data (or do it yourself).
You want
my $dbh = DBI->connect('DBI:mysql:mydb:localhost', 'user', 'password', {
mysql_enable_utf8mb4 => 1,
})
or die($DBI::errstr);
which is equivalent to
my $dbh = DBI->connect('DBI:mysql:mydb:localhost', 'user', 'password')
or die($DBI::errstr);
$dbh->do('SET NAMES utf8mb4')
or die($dbh->errstr);
$dbh->{mysql_enable_utf8mb4} = 1;