I've come across a problem with special characters in a Django query (Django 1.9.2).
I've created a model that stores a word in it, and I'm feeding that model with words from a Spanish dictionary, using code as follows:
MyModel.objects.get_or_create(word=myword)
And now I've realized that words containing special characters haven't been added, so, for example, there is only one row of MyModel in the database for año and ano! And when I query the database I retrieve the same object for these two queries:
MyModel.objects.get(word='año')
MyModel.objects.get(word='ano')
...and no, those words are not the same ;D
I want to create one object for each, of course.
Short answer: You probably want COLLATION utf8_spanish2_ci.
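As a rough sketch of how that collation could be applied to the column, the helper below only builds the ALTER TABLE statement as a string. The table name `myapp_mymodel`, the column length of 100, and the NOT NULL constraint are assumptions for illustration (they are not given in the question), and `build_collation_sql` is a hypothetical helper, not part of Django or MySQL.

```python
def build_collation_sql(table, column, max_length,
                        charset="utf8", collation="utf8_spanish2_ci"):
    """Return a MySQL ALTER TABLE statement that changes one column's collation.

    MODIFY needs the full column definition, so the VARCHAR length and the
    NOT NULL constraint here must match the real column (assumed values).
    """
    return (
        "ALTER TABLE `{table}` MODIFY `{column}` VARCHAR({length}) "
        "CHARACTER SET {charset} COLLATE {collation} NOT NULL"
    ).format(table=table, column=column, length=max_length,
             charset=charset, collation=collation)


# Hypothetical table and column names; adjust to the real schema.
sql = build_collation_sql("myapp_mymodel", "word", 100)
print(sql)
```

The resulting statement could be run from the mysql client, or passed to `migrations.RunSQL` inside a Django migration so the change travels with the project. After the change, get_or_create should see año and ano as different values.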
Long answer:
If you are using CHARACTER SET utf8 (or utf8mb4) on the column/table in question, and if you need ano != año, you need COLLATION utf8_bin or utf8_spanish_ci or utf8_spanish2_ci (with utf8mb4, use the matching utf8mb4_ names). All other utf8 collations treat n = ñ. spanish2 differs from spanish in that ch is treated as a separate "letter" between c and d. Similarly, ll is treated as a separate "letter" between l and m.
Note that other 'accents' are ignored in comparisons for most utf8 collations except for utf8_bin. For example, C = Ç (except for _bin and _turkish).
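To see why the two words collapse into one row, here is a loose Python analogy of an accent-insensitive, case-insensitive comparison versus a binary one. This is not MySQL's actual collation algorithm; `fold_accents` is a hypothetical helper that only mimics the effect by stripping combining marks.

```python
import unicodedata


def fold_accents(text):
    """Strip combining marks and case, roughly like an accent-insensitive _ci collation."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed
                       if unicodedata.category(ch) != "Mn")
    return stripped.casefold()


# Binary-style comparison (like utf8_bin): the words differ.
binary_equal = "año" == "ano"

# Accent-insensitive comparison: the words compare equal.
folded_equal = fold_accents("año") == fold_accents("ano")

print(binary_equal, folded_equal)
```

Under a collation that behaves like the folded comparison, get_or_create finds the existing ano row when asked for año, so no second row is created.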