mysql, unicode, utf-8, mysql-workbench, multiple-languages

What is the best utf8 collation in MySQL Workbench for data in many languages?


I'm developing a website that gives simple, understandable meanings for terms in different languages. I plan to add as many languages as possible as the site grows, so the terms and other content could be in any language, and I need a database that can store all of it without issues.

I'm using MySQL Workbench to design the table schema.

It should be able to compare words in any language when searching, and it should retrieve data as fast as possible.

So what is the best choice for that: utf8-default or something else?

The languages I'm focusing on so far are:

  1. Sinhala (Sinhalese)
  2. English
  3. Russian
  4. Spanish
  5. French
  6. German
  7. Chinese
  8. Japanese
  9. Danish
  10. Tamil
  11. Hindi
  12. Italian

Solution

  • Since your database will hold multiple languages, I think utf8_unicode_ci might be the one for you.

    Accuracy

    • utf8_unicode_ci is based on the Unicode standard for sorting, and sorts accurately in a very wide range of languages.

    Performance

    • utf8_unicode_ci uses a more complex comparison algorithm than simpler collations such as utf8_general_ci, because it aims for correct sorting across a very wide range of languages. This makes it slower when sorting and comparing large numbers of values.
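
As a rough illustration of the suggestion above, here is a minimal sketch of how the collation could be set when creating a table. The table and column names (terms, language_code, term, meaning) are hypothetical and not taken from the question; adjust them to your own schema. Note that on newer MySQL versions the 4-byte counterparts utf8mb4 and utf8mb4_unicode_ci also exist, which may matter if you need characters outside the Basic Multilingual Plane.

```sql
-- Hypothetical schema: table and column names are illustrative only.
CREATE TABLE terms (
    id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    language_code VARCHAR(8)   NOT NULL,   -- assumed language tag, e.g. 'si', 'ru', 'zh'
    term          VARCHAR(255) NOT NULL,
    meaning       TEXT         NOT NULL,
    INDEX idx_term (term)                  -- index so lookups by term avoid a full scan
) ENGINE = InnoDB
  DEFAULT CHARACTER SET utf8
  DEFAULT COLLATE utf8_unicode_ci;         -- every text column inherits this collation

-- Lookups on `term` are compared using the column's collation,
-- so the match is case-insensitive.
SELECT term, meaning
FROM terms
WHERE term = 'Katze';
```

In MySQL Workbench the same choice can be made from the collation drop-down of the schema or table editor, so the generated DDL should look roughly like the statement above.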