SQL Server 2008 R2
create table #test (c1 nvarchar(5) not null)
insert into #test values
(N'aaa'),
(nchar(65533)),
(N'bbb')
select * from #test where c1 like N'%�%'
select * from #test where c1 like N'%'+nchar(65533)+N'%'
Result is
c1
----
aaa
�
bbb
Why? I haven't found in MSDN any notes on this case.
That character (and many others, depending on the Collation version being used) just happens to have no sort weight defined. It is effectively nothing. So whether you have 1 instance of it or 100, it is invisible to anything but binary Collations. Meaning, the following WHERE
predicates:
LIKE N'%' + NCHAR(0xFFFD) + N'%'
LIKE N'%' + NCHAR(0xFFFD) + NCHAR(0xFFFD) + N'%'
LIKE N'%' + NCHAR(0xFFFD) + NCHAR(0xFFFD) + NCHAR(0xFFFD) + N'%'
and so on, are all equivalent to the following:
LIKE N'%%'
And that is why you get all 3 rows returned.
This does not mean that this character should have no sort weight. It actually is defined in Unicode as having a weight, but for some reason Microsoft has left quite a few characters without any sort weight at all (though the total number of characters missing a sort weight is decreasing with each new Collation version, the newest being the version 140
Collations, which came with SQL Server 2017 and are only for the Japanese Collations).
For any character that has no sort weight, the only way to match it is by using a binary Collation. Binary Collations are those ending in either _BIN
or _BIN2
, but only use the _BIN2
Collations since they sort properly whereas the older _BIN
Collations do not. For example:
SELECT * FROM #test WHERE c1 LIKE N'%�%' COLLATE Latin1_General_100_BIN2;
Returns:
c1
----
�
Also, I tested with the following and they returned all 3 rows:
So, the following should be good:
Also, it is best to use the highest Collation version available for the Collation you are trying to use. For example, use Latin1_General_100_*
instead of Latin1_General_*
, and so on. Use the following query to find the Collations available on your Instance:
SELECT col.*
FROM sys.fn_helpcollations() col
ORDER BY col.[name];