The system is a PHP application accessing a MySQL database. The first tables were created with the standard latin1 encoding and filled via mysqli in PHP, without setting any encoding. The PHP scripts and data all work in UTF-8.
The newer tables have set the encoding to utf8_bin and in front of every transaction SET CHARACTER SET utf8
is sent.
If I look at the newer tables in the mysql database (via a sql explorer like HeidiSQL) every special character is displayed correctly. However, in every older table the typical latin1-utf8 errors are visible like Ãœ for ü.
Is there an easy way to fix this like in one of the following ways:
SET CHARACTER SET utf8
or latin1
in front of every statement, fitting the tables encoding. (merely a workaround)SET CHARACTER SET utf8
has to be sent at the beginning of every mysqli connection (or maybe there is a way to set this as standard?)SET CHARACTER SET utf8
in front of transactions anymore, but wrong encoding in the database explorer.It seems as if the Database takes all tables as utf8 and shows latin1 tables therefore with wrong characters. Mysqli takes all tables as latin1 if not told different.
The application is productive, the encoding problem is not visible to the user, as the right encoding is told to mysqli in front of every statement. But I feel like that is not a good practice.
I recognize that there is something wrong with how the database is set up, and I hope to learn what is the best practice to fix this.
The first tables were created with the standard latin1 encoding and filled via mysqli in PHP
So, those tables have garbage. Do you need to recover the data?
in front of every transaction SET CHARACTER SET utf8 is sent.
That's useless. Instead, do SET NAMES utf8
once after connecting.
Ü
is Mojibake for Ü
. Search this forum for Mojibake.
Having some tables latin1 with utf8 bytes, while other tables are utf8 -- that will be a painful mess going forward. Stop now and clean up the mess. Go utf8 all the way.
Summary of Mojibake:
SET NAMES latin1
(or set_charset('latin1')
or ...), probably by default. (It should have been utf8
.)CHARACTER SET latin1
. (Or possibly it was inherited from the table/database.) (It should have been utf8
.)CHARACTER SET utf8
, but it should have been that.If you need to fix for the data it takes a "2-step ALTER", something like
ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8 ...;