Search code examples
sqlsql-servert-sqlsql-server-2012string-comparison

compare strings of uneven length in TSQL


I have two tables. Both of them contain (dutch) postalcodes. Those have the format 9999AA and are stored as varchar(6). In the left table the codes are complete

John Smith        1234AB
Drew BarryMore    3456HR
Ted Bundy         3456TX
Henrov            8995RE
My mother         8995XX

In the right table the codes can be incomplete

1234AB Normal neigbourhood
3456   Bad neighbourhood
8995R  Very good neighbourhood

I need to join these tables on the postalcodes. In this example the output would have to be

John Smith        Normal neighbourhood
Drew BarryMore    Bad neighbourhood
Ted Bundy         Bad neighbourhood
Henrov            Very good neighbourhood
My mother         -unknown-

So I have to join the two tables based on the length of the postal code in the right table.

Any suggestions as to how to do this? I could only come up with a CASE in the ON statement but that was not so smart ;)


Solution

  • If you have no "duplicates" in the second table, you could use like:

    SELECT t1.*, t2.col2
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.postalcode LIKE t2.postalcode + '%';
    

    However, this is not going to be efficient. Instead, an index on table2(postalcode) and a series of LEFT JOINs is probably faster:

    SELECT t1.*, COALESCE(t2a.col2, t2b.col2, t2c.col2)
    FROM table1 t1
    LEFT JOIN table2 t2a ON t2a.postalcode = t1.postalcode
    LEFT JOIN table2 t2b ON t2b.postalcode = LEFT(t1.postalcode, LEN(t1.postalcode) - 1)
    LEFT JOIN table2 t2c ON t2c.postalcode = LEFT(t1.postalcode, LEN(t1.postalcode) - 2)
    

    This can take advantage of an index on table2(postalcode). In addition, it only returns one row, even when there are multiple matches in table2, returning the best match.