Search code examples
sql-servermultilingualnon-ascii-characterscollate

SQL Collate statement where clause


I am trying following query in SQL Server 2008 R2. While working with accent sensitivity I found this:

select case when 'Наина' = '毛泽东先生' then 'Match' else 'No Match' end col

I see result is:

'Match'

What could be possibly be the reason for this behavior? I have also tried using COLLATE with different accents but it didn't work.


Solution

  • You should use NVARCHAR in your comparison. Upon checking, both strings are converted to '?????' when using VARCHAR:

    SELECT *
    FROM (VALUES
        (N'Наина', 'Наина'),
        (N'毛泽东先生', '毛泽东先生')
    )t([NVACRHAR], [VARCHAR])
    

    The result of the above query:

    NVACRHAR     VARCHAR
    --------     -------
    Наина        ?????
    毛泽东先生    ?????
    

    So, in order to achieve the desired result, you must use NVARCHAR:

    select 
        case 
            when 'Наина' = '毛泽东先生'  then 'Match' 
            else 'No Match' 
        end match_VARCHAR,
        case 
            when N'Наина' = N'毛泽东先生' then 'Match' 
            else 'No Match' 
        end match_NVARCHAR
    

    Edit: I don' really have the real explanation, the above just shows the result of my testing. Another answer might explain this better.