I have a table that contains the names of various recording artists. One of them has a dash in their name. If I run the following:
Select artist
, substring(artist,8,1) as substring_artist
, ascii(substring(artist,8,1)) as ascii_table
, ascii('-') as ascii_dash_key /*The dash key next to zero */
, len(artist) as len_artist
From [dbo].[mytable] where artist like 'Sleater%'
Then the following is returned. This seems to indicate that a dash (ascii 45) is being stored in the artist column
However, if I change the where clause to:
From [dbo].[mytable] where artist like 'Sleater' + char(45) + '%'
I get no results returned. If I copy and paste the output from the artist column into a hex editor, I can see that the dash is actually stored as E2 80 90, the Unicode byte sequence for the multi-byte hyphen character.
So, I'd like to find and replace such occurrences with a standard ascii hyphen, but I'm am at a loss as to what criteria to use to find these E2 80 90 hyphens?
Your char is the hyphen, information on it here :
https://www.charbase.com/2010-unicode-hyphen
You can see that the UTF16 code is 2010 so in T-SQL you can build it with
SELECT NCHAR(2010)
From there you can use any SQL command with that car, for example in a select like :
Select artist
From [dbo].[mytable] where artist like N'Sleater' + NCHAR(2010) + '%'
or as you want in a
REPLACE( artist, NCHAR(2010), '-' )
with a "real" dash
EDIT:
If the collation of your DB give you some trouble with the NCHAR(2010) you can also try to use the car N'‐' that you'll copy/paste from the charbase link I gave you so :
REPLACE( artist , N'‐' , '-' )
that you can even take from the string here (made with the special car) so all made for you :
update mytable set artist=REPLACE( artist, N'‐' , '-' )