Search code examples
sql-servert-sqlssisaccent-insensitive

SQL - Accented characters not updating


Hello everyone and welcome to my nightmare. I am preparing a database in order to run a number of SSIS packages to migrate the required data into a data warehouse.

I am in the process of updating customer names with accented characters to non-accented characters prior to running the SSIS package, as this causes my DimCustomer Population package to fail.

The query I used ran and ran quickly and efficiently (I was living the dream) however, I am now faced with three records which cause my UPDATE statement to run and run to the point I have to cancel the query after about an hour as this is nowhere near acceptable.

The characters contained in these names are: í á and . My suspicion is that these are only recognised in a particular format (which I cannot seem to find).


The queries and update I run are below for your perusal:

SELECT lastname COLLATE latin1_general_ci_ai      AS 'LastName',
       MIN(lastname COLLATE latin1_general_ci_as) AS 'CorrectLastName',
       MAX(lastname COLLATE latin1_general_ci_as) AS 'IncorrectLastName'
INTO   #lastname
FROM   learner WITH (nolock)
GROUP  BY lastname COLLATE latin1_general_ci_ai
HAVING MIN(lastname COLLATE latin1_general_ci_as) <> MAX(lastname COLLATE latin1_general_ci_as);

IF Object_id ('tempdb..#LastNameUpdate') IS NOT NULL
  DROP TABLE #lastnameupdate

SELECT L.learnerid,
       L.lastname,
       LN.correctlastname,
       LN.incorrectlastname
INTO   #lastnameupdate
FROM   learner L WITH (nolock)
       INNER JOIN #lastname AS LN
               ON L.lastname = LN.incorrectlastname

UPDATE L
SET    L.lastname = LNU.correctlastname
FROM   learner L
       INNER JOIN #lastnameupdate AS LNU
               ON L.learnerid = LNU.learnerid  

Please ask any questions if you need anymore info, I am a regular stack checker for pre-posted solutions and (even if a little sad) general reading. Any information, guidance or advise would be greatly appreciated. Thanks!


Solution

  • Just for speeding things up: you can add the following after creating the #lastname table.

    Delete from #lastname where CorrectLastName = incorrectLastName
    

    I think if you have only a few incorrect last names this will speed a lot up.

    You can also try to avoid the creation of the table #lastnameupdate.
    Because with a the use of a sub query instead you could do the same.
    This should also speed the whole up a little bit.