I have a mySQL database full of accented characters. The DB is populated periodically from MS Access 2010.
In Access you'll see é è à Ü. In the export process in Access UTF-8 encoding is specified.
Open the resulting text file in UltraEdit on my PC and you'll see "Vieux Carré" and UE says the encoding is U8-DOS.
The files are uploaded via FTP and imported via LOAD DATA LOCAL INFILE queries like this
LOAD DATA LOCAL INFILE '$dir/$t.$ext' INTO TABLE `$t` FIELDS OPTIONALLY ENCLOSED BY '|' TERMINATED BY ';' LINES TERMINATED BY '\n'
In mySQL the field collation is set to utf8_general_ci.
If you query mySQL from the command line or from phpMyAdmin you'll see "Vieux Carré".
What am I doing wrong?
If you're using LOAD DATA INFILE
with a file that has a charset that's different from your database's default, you need to specify what character set the file is in:
LOAD DATA LOCAL INFILE '$dir/$t.$ext'
INTO TABLE `$t`
CHARACTER SET utf8
FIELDS OPTIONALLY ENCLOSED BY '|' TERMINATED BY ';'
LINES TERMINATED BY '\n'
Note that the database character set is database-wide, and is different than the table character set and the column character set. SHOW CREATE DATABASE database_name
will show you the database charset.