I use the below code to create an indicator that is 0 if the values in the code are not found in a record of another table and 1 if all pertinant values match a record on another table.
proc sql;
create table test as
select id
,a.company_yr in (select company_yr from table2)
and a.industry in (select industry from table2)
and a.sector in (select sector from table2) as match_ind
from work.table1 a;
quit;
My problem is that the company_yr, industry and sector aren't always a perfect match because of abbreviations or other mix ups in the data (e.g., 'FORD MOTORS' in table1 and 'FORD' in table2). I need some way to use a LIKE statement or INDEX statement in conjunction with a TRIM statement to allow me to match parts of the string to make the indicator more accurate. I haven't been able to find a way to effectively accomplish this yet.
Try something like this where you replace the "strip()" function with whatever you want to accomplish the desired normalization, perhaps with nested "compress()" functions, and some "upcase()" for good fun. You could turn the equality in the subquery to a LIKE, but it's not clear what you'd match on.
proc sql;
create table test as
select
id,
exists (select 1 from work.table2 b
where strip(a.industry) = strip(b.industry)
and strip(a.sector) = strip(b.sector)
) as match_ind
from work.table1 a;
quit;