Search code examples
sqlstringreferencematchfuzzy-search

SQL - comparing strings from two tables (fuzzy match...sorta)


I have searched the questions and there are similar questions asked but no solution that I think I can use. This question is similar to a fuzzy match...sorta. I need help with comparing two tables. One table is a company reference table and the other is a table that get's raw company data imported into it daily. The reference table is clean and has a company id associated with every single company. The daily data that is imported does not have a company id. What I'm trying to do is have the daily data reference the 'Company Reference Table' on company_name and update the 'Company Table's column company_state based on the company_name. Unfortunately, the daily data string for company_name coming in is not always the same each day. There can be various characters (a-z, 0-9, +, -, .) and spaces in front or after the actual company name with varying lengths daily so I do not believe I can use charindex to clean it up.

Company Reference Table

company_id  company_name  company_state
1           Awesome Inc   NY
2           Excel-guru    AL
3           Clean All     MI 

Company Table

company_name              company_state
abc123 Awesome   Inc      NULL
Excel gur xyz-987         NULL
Clean All Cleanall        NULL

What I want it to do is this. Sorta like a fuzzy match.

Company Table

company_name              company_state
abc123 Awesome   Inc      NY
Excel gur xyz-987         AL
Clean All Cleanall        MI

Any help is much appreciated. Thank you.


Solution

  • Try below Query to update company table:

    update company c INNER JOIN company_ref cr
    ON c.company_name LIKE concat('%', cr.company_name, '%') 
    SET c.company_state = cr.company_state;
    

    Another way just by using SELECT

    SELECT c.*, cr.* FROM company c INNER JOIN company_ref cr
    ON c.company_name LIKE concat('%', cr.company_name, '%');
    

    SQL Fiddle: http://sqlfiddle.com/#!2/ec76f/1