Search code examples
sqlsql-serversql-server-2008sql-likeidentify

SQL: Correctly identify and correct(if possible) names in database


I have a large database of names, and I'm hoping to identify incorrect capitalization. Right now I'm using the following...

SELECT *
  FROM myTable
 WHERE LastName LIKE '%Mcd%'  COLLATE SQL_Latin1_General_Cp1_CS_AS

Now of course this is inefficent because I have to run/edit this over and over for different cases. My thinking is find a list of name cases that would provide possible problems, and do LIKE IN ('case1','case2','case3','case4', ...)

if that's possible. Is there another way that I'm not thinking of?

Other cases I'm thinking I'll have to check are abbreviations (%.%), hypens (%-%), and apostrophes (%'%).


Solution

  • You could use

    SELECT *
      FROM myTable
     WHERE LastName LIKE '%Mcd%' or LastName LIKE '%Foo%' 
    

    Or

    WITH T(Word) AS
    (
    SELECT 'Mcd' UNION ALL
    SELECT 'Foo'
    )
    
    SELECT *
    FROM myTable
    JOIN T ON LastName LIKE '%' + Word + '%' 
    

    To avoid needing to scan myTable multiple times.

    To avoid processing the string multiple times you could use CLR and Regular Expressions.