I would like to take a list of company names, compare them to a list of common misspellings and replace the misspellings with the correct spelling in PROC SQL. This is currently done using a nested TRANWRD (see example below with just three replacements), but there are currently ~70 that I would like to make.
TRANWRD(TRANWRD(TRANWRD(COMPANY_NAME
,'LIMITED','LTD')
,'LIMITE','LTD')
,'LIMTED','LTD')
Is there a better way to do this than just adding more TRANWRD statements? Ideally I would have a table of misspellings and join that table onto it somehow.
Given a dataset of corrections like:
data spelling ;
input from : $20. to : $20. ;
cards ;
Limited LTD
Limite LTD
Limted LTD
;
There are lots of options for using this to generate code. One approach would be to generate one assignment statement per record. So instead of nesting tranwrd calls, you generate assignment statements like:
Company_Name=tranwrd(Company_Name,'Limited','LTD') ;
Company_Name=tranwrd(Company_Name,'Limite','LTD') ;
Company_Name=tranwrd(Company_Name,'Limted','LTD') ;
Common ways to generate that would be with PROC SQL to create a macro variable, PUT statements to write an %include file, or CALL EXECUTE. The PROC SQL approach would look like (untested):
proc sql noprint;
select "Company_Name=tranwrd(Company_Name,'"||trim(from)||"','"||trim(to)||"') ;"
into :Corrections separated by " "
from spelling
;
quit ;
%put %bquote(&corrections);
data want;
set have;
&corrections
run;