mysql phpmyadmin collation php4

Changing Column Collation - Is it Safe?


I have a table in my ticketing system with 15,000 records, each containing an email which has been converted into a message to be added to a ticket.

Our current problem is with collation: when I got my hands on the system it was using latin1_swedish_ci. However, we use several European languages in the system, so we need to be able to store non-ASCII characters correctly.

I was unable to get this to work with the latin1_swedish_ci collation but I have found on my test version of the system that switching the collation to utf8_bin solves the problem.

So I need to know whether it is safe to make this change to my table/column on the live system. Will it take a long time (phpMyAdmin is pretty horrible when you try to make it work really hard), and will it corrupt any existing data?


Solution

  • It's safe to convert from one charset to another when all characters from the first charset are representable in the second charset.

    This is the case for latin1 to utf8: it's safe.

    However you have to ensure that the application itself can handle utf8 data.

    On utf8_bin: the utf8 part is the charset (how characters are encoded) and the bin suffix selects the collation (how strings are compared and sorted). Don't use bin: it compares strings by the binary value of each character, which makes every comparison case-sensitive, and that is probably not what you expect. Try utf8_unicode_ci instead. (See http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html )
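
As a rough sketch of what the conversion could look like in MySQL: the table name `ticket_messages` and column name `body` below are hypothetical placeholders, since the question does not give the real schema, and the column type `TEXT` is likewise an assumption. Take a backup before running anything like this on the live system.

```sql
-- Hypothetical names: table `ticket_messages`, column `body` (not from the question).

-- Inspect the current charset/collation of each column before changing anything.
SHOW FULL COLUMNS FROM ticket_messages;

-- Option 1: convert every character column in the table from latin1 to utf8.
-- CONVERT TO re-encodes the stored data, and may widen TEXT/VARCHAR types
-- so that the converted values still fit.
ALTER TABLE ticket_messages
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

-- Option 2: convert a single column only.
-- MODIFY replaces the whole column definition, so repeat any existing
-- attributes (NOT NULL, DEFAULT, ...) that the real column has.
ALTER TABLE ticket_messages
  MODIFY body TEXT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

Running either statement against a copy of the table first, then checking that the accented characters still display correctly in the application, is a cheap way to confirm the result before touching the live data.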