Tags: php, html, mysql, htmlspecialchars

Accents in the database


I am creating a database with MySQL, using a utf8 collation. The content is in a European language that has accents and special characters such as ç.

What is the best way to store text in the database: with the special characters as they are, or converted? For example, should I store différent or diff&eacute;rent ("different" in French)? In other words, should I convert with htmlspecialchars before or after I store the text in the database?

I tried both ways and they work well. Is one option recommended for technical reasons, or is either fine? I want to get this right now, while I am starting the database, because it will be harder to change later.


Solution

  • I think you should definitely NOT replace your characters with HTML entities: they are a convention for HTML and XML markup, not for everything!

    For instance, if you had to serve JSON for some reason, you would be forced to decode the entities first, then serve the text as JSON, where non-ASCII characters are represented in a different way (as raw UTF-8 or as \uXXXX escapes).

    Also, converting characters would make your stored strings much less human-readable (and thus less human-searchable): Le premier écoquartier d’Île-de-France a été inauguré would be encoded into something absolutely devilish.

    Let MySQL do the hard work of handling non-ASCII characters.
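
As a rough illustration of the advice above, the PHP sketch below keeps the text as raw UTF-8 and escapes it only at the moment it is written into a particular output format. The function names (connect, render_html, render_json), the DSN values, and the $stored sample are hypothetical and not from the question. utf8mb4 is assumed as the connection charset because MySQL's utf8 only covers characters of up to 3 bytes; the question itself only says "utf8".

```php
<?php
// Hypothetical connection helper: the host and database name are placeholders.
// The charset in the DSN tells MySQL that the client sends and expects UTF-8.
function connect(string $user, string $password): PDO
{
    return new PDO(
        'mysql:host=localhost;dbname=example;charset=utf8mb4',
        $user,
        $password,
        [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
    );
}

// Sample value as it would come back from the database: raw UTF-8, no entities.
$stored = 'Le premier écoquartier d’Île-de-France a été inauguré';

// HTML output: escape only the markup-significant characters, at render time.
function render_html(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// JSON output: no entity decoding step is needed because none were stored.
function render_json(string $text): string
{
    return json_encode(['title' => $text], JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}

// For comparison only: what the row would contain if entities were stored.
$entityEncoded = htmlentities($stored, ENT_QUOTES, 'UTF-8');

echo render_html($stored), "\n";
echo render_json($stored), "\n";
echo $entityEncoded, "\n";
```

With this split, the same stored row can feed an HTML page, a JSON API, or a plain-text export, and accented words remain searchable as typed. The connect helper is defined but not called here, since it needs a running MySQL server.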