mysql ascii casting latin1

How do I convert a column to ASCII on the fly, without saving, to check for matches against an external ASCII string?


I have a member search function where you can enter parts of names, and it should return all members for whom at least one of username, firstname or lastname matches that input. The problem is that some names contain 'weird' characters, like the é in Renée, and the user doesn't want to type the weird character but the normal ASCII substitute, e.

In PHP I convert the input string to ASCII using iconv (just in case someone types weird characters). In the database, however, I also need to convert the weird characters to ASCII for the strings to match.

I tried the following:

SELECT
  CONVERT(_latin1'Renée' USING ascii) t1, 
  CAST(_latin1'Renée' AS CHAR CHARACTER SET ASCII) t2;

(That's two tries.) Neither works: both output Ren?e, where the question mark should be an e. It would be fine if the output were Ren?ee, since I could just remove all question marks after the conversion.

As you can imagine, the columns I want to query are encoded as latin1.

Thanks.


Solution

  • You don't need to convert anything. Your requirement is to compare two strings and ask if they're equal, ignoring accents; the database server can use a collation to do that for you:

    Non-UCA collations have a one-to-one mapping from character code to weight. In MySQL, such collations are case insensitive and accent insensitive. utf8_general_ci is an example: 'a', 'A', 'À', and 'á' each have different character codes but all have a weight of 0x0041 and compare as equal.

    mysql> SET NAMES 'utf8' COLLATE 'utf8_general_ci';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT 'a' = 'A', 'a' = 'À', 'a' = 'á';
    +-----------+-----------+-----------+
    | 'a' = 'A' | 'a' = 'À' | 'a' = 'á' |
    +-----------+-----------+-----------+
    |         1 |         1 |         1 |
    +-----------+-----------+-----------+
    1 row in set (0.06 sec)
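
Applied to the member search from the question, the same idea might look like the sketch below. The table name `members` and its `id` column are assumptions for illustration (only `username`, `firstname` and `lastname` come from the question), and `'rene'` stands in for the user's input. Because the columns are latin1, each one is converted to utf8 on the fly so that `utf8_general_ci` can be applied; nothing is written back to the table.

```sql
-- Quick check without any table: 'Renée' given as latin1 bytes (0xE9 is é),
-- compared accent- and case-insensitively against plain ASCII input.
SELECT CONVERT(_latin1 X'52656EE965' USING utf8) COLLATE utf8_general_ci
       LIKE '%rene%' AS matches;

-- Hypothetical member search; `members` and `id` are assumed names.
SELECT id, username, firstname, lastname
FROM members
WHERE CONVERT(username  USING utf8) COLLATE utf8_general_ci LIKE CONCAT('%', 'rene', '%')
   OR CONVERT(firstname USING utf8) COLLATE utf8_general_ci LIKE CONCAT('%', 'rene', '%')
   OR CONVERT(lastname  USING utf8) COLLATE utf8_general_ci LIKE CONCAT('%', 'rene', '%');
```

In application code the literal would normally be a bound parameter rather than an inline string. A LIKE pattern with a leading wildcard cannot use an ordinary index, so this kind of search scans the table whether or not the columns are converted.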