@I have a row in a table which contains the following text "Urbański, Mariusz". The hex representation for character "ń" is "6e cc 81". So this is stored in a decomposition Unicode normalization form.
When I use a query like the following "...... where Identification = N'Urbański, Mariusz'" and character "ń" matches the decomposition form ("6e cc 81") query returns the expected records.
If I run the exact same query using a Composition Unicode Normalization Form ("ń" = "c5 84") I get no results.
I also tried "Select 1 Where N'Urbański, Mariusz' = N'Urbański, Mariusz'" where I use the 2 variations of "ń" which always returns true.
Is there a way to make SQL Server treat the 2 values as equal?
Here are my database configuration are requested by Rhys Jones
Database Collation : "Danish_Norwegian_CI_AS"
Column1 : IdRightsHolderSourceIdentification = NULL
Column2 : VersionInfo = NULL
Column3 : Source = "Danish_Norwegian_CI_AS"
Column4 : Identification = "SQL_Latin1_General_CP437_BIN"
Column5 : RightsHolder = NULL
The problematic column is, as Rhys Jones very well guessed, column4 and it has a binary collation (that's what BIN in the end means right?). Thanks a lot fo the assistance.
I can see from your question that you understand Unicode so I guess the bit you're missing is that in SQL Server there is something called collation. This is what determines how SQL Server compares values. I've put together a script to demonstrate various successful and unsuccessful comparisons between the two forms of the name. The SQL Server setup I'm currently using uses Latin1_General_CI_AS under which both forms of the name ARE equal. I've tried a few things but I can't make them not equal unless I use a binary collation. I'd be interested to know your server and database collations and to see the table definition (including collations) for your table.
Hope this helps,
Rhys
-- UTF8 character list -- http://www.fileformat.info/info/charset/UTF-8/list.htm?
-- n - LATIN SMALL LETTER N (U+006E) 6e
-- ́ - COMBINING ACUTE ACCENT (U+0301) cc81
-- ń - LATIN SMALL LETTER N WITH ACUTE (U+0144) c584
create table dbo.MyTable (id int not null, name nvarchar(100) collate Latin1_General_CI_AS not null)
declare @a nvarchar(100); set @a = N'Urba' + nchar(0x0144) + N'ski, Mariusz'
declare @b nvarchar(100); set @b = N'Urba' + nchar(0x6e) + nchar(0x0301) + N'ski, Mariusz'
insert dbo.MyTable values (1, @a)
insert dbo.MyTable values (2, @b)
-- Display server, database and column collations
select
SERVERPROPERTY(N'Collation') as [server_collation],
DATABASEPROPERTYEX(DB_NAME(), N'Collation') as [database_default_collation],
c.collation_name as [column_collation]
from
sys.objects t join sys.columns c on c.object_id = t.object_id
where
t.object_id = object_id('dbo.MyTable') and c.name = 'name'
-- Test with Latin1_General_CI_AS
select id, name from dbo.MyTable where name collate Latin1_General_CI_AS = @a collate Latin1_General_CI_AS
-- Test with French_CI_AS
select id, name from dbo.MyTable where name collate French_CI_AS = @a collate French_CI_AS
-- Test with Latin1_General_BIN2
select id, name from dbo.MyTable where name collate Latin1_General_BIN2 = @a collate Latin1_General_BIN2