I need to find State abbreviations in column name my_column
. This column can contain values like these
John Smith of AZ
--> Match
John Smith of AZ(Tucson)'
--> Match
AZ John Smith
--> Match
John Smith AZ for Tucson
--> Match
Utah Jazz
--> Don't Match
Azyme
--> Don't Match
'Hazy
--> Don't Match
I tried using CASE
expressions with CHARINDEX
and LIKE
to do all this matching, but it is getting super-ugly. I wonder if there is a better way.
I asked our DBA to install Full-Text Index to see if I can do something better with CONTAINS, but not sure if it really helps.
How about something like
declare @testcases table (testval varchar(50));
insert into @testcases
values
('John Smith of AZ'),
('John Smith of AZ(Tucson)'),
('AZ John Smith'),
('John Smith AZ for Tucson'),
('Utah Jazz'),
('Azyme'),
('Hazy')
select PATINDEX('%[^A-Z]AZ[^A-Z]%',testval) + PATINDEX('AZ[^A-Z]%',testval) + PATINDEX('%[^A-Z]AZ',testval)
from @testcases;
It'll match anything containing AZ, that doesn't border an alpha-numeric character. Mind you, this isn't very performant, but it will work.
Also it only works for Arizona. To find the other 49 states you'd have to scan again. I would probably try to extract the state in your application before inserting into the database.