I have two SAS tables which are the same, only the column names aren't the same.
The first table D1
has 80 column names that have the following pattern X1000_a010_b020
and the second table D2
has 80 column names that have the following pattern X_1000_a0010_b0020
. Please note that they are not in the same order.
I want to make sure that all the columns from D1
have the same names as in D2
. In other words, I want to add the underscore after the X and add a 0 after all the a's and b's.
However I don't how to proceed. I would guess that RegEx would be the go to but I am not familiar with it.
As a structure example, some times ago I was using the following code to replace spaces in a column name with an underscore. I would like to do the same but for the underscore after the X and the 0 after the a's and b's.
%macro rename_vars(table);
%local rename_list sqlobs;
proc sql noprint;
select catx('=',nliteral(name),translate(trim(name),'_',' '))
into :rename_list separated by ' '
from sashelp.vcolumn
where libname=%upcase("%scan(work.&table,-2,.)")
and memname=%upcase("%scan(&table,-1,.)")
and indexc(trim(name),' ')
;
quit;
%if &sqlobs %then %do ;
proc datasets lib=%scan(WORK.&table,-2);
modify %scan(&table,-1);
rename &rename_list;
run;
quit;
%end;
%mend rename_vars;
Your example code seems to show you have a plan for how to implement the renaming so let's just concentrate on generating the OLDNAME <-> NEWNAME pairs. You can generate a list of names in a particular dataset with PROC CONTENTS or querying DICTIONARY.COLUMNS with SQL code (or SASHELP.VCOLUMN with any tool). So let's assume you have a dataset named CONTENTS that contains a variable named NAME. So the goal is to create a new variable, which we can call NEWNAME.
So let's just translate the three transformations you say you need directly into individual actions. You can collapse the steps if you want, but there is no pressing need for efficiency in this operation.
data fixed_names;
set contents;
newname = tranwrd(upcase(name),'_A','_A0');
newname = tranwrd(newname,'_B','_B0');
newname = cats(char(newname,1),'_',substr(newname,2));
keep name newname;
run;
Now you could pull that list into a macro variable. So a space delimited list of old=new
pairs is useful for rename.
proc sql noprint;
select catx('=',name,newname) into :renames
from fixed_names
where newname ne upcase(name)
;
quit;
Or if the goal is to literally compare the two datasets you might want to generate one list of old names and a separate list of new names.
select name,newname
into :oldlist separated by ' '
, :newlist separated by ' '
from fixed_names
;
Which you could then use with PROC COMPARE directly without any need to rename any variables.
proc compare data=DS1 compare=DS2 ;
var &oldlist;
with &newlist;
run;