I encounter a need to perform the following action but I am not sure how to achieve this.
I have a table with variables I need to perform regression and the table looks like
Colummn name | Y | X1 | X2 | X3 | ....| X10 |
Row 1 __________ | - | --- | --- | --- | ....| --- |
Row 2 __________ | - | --- | --- | --- | ....| --- |
I want to run 10 regressions and compare the adjusted R Squared of each model, with the regression code being:
proc reg data = table outest = out_table;
model modelN /adjrsq;
run;quit;
In the 10 models, the Y variable remains unchanged, with the X variable to increase by 1 each time:
Model1: Y = X1;
Model2: Y = X1 +X2
Model3: Y = X1+ X2 +X3
.....
Model10: Y = X1+X2+X3+....+X10
I want to write a
DO i = 1 to 10
to perform the task for simplicity and convenience. The final output could be 10 out_tables from the PROC REG codes. However, I don't know how to refer to the dependent variable in the loop. Could anybody help with this issue? Thanks!
If your x variables are named x1, x2, x3, etc., you can use the -
shortcut to select them in order.
%macro model(n);
%do i = 1 %to &n;
model&i: model y=x1-x&i /adjrsq;
%end;
%mend;
proc reg data=have outest=out_table;
%model(10);
run;
If the variables are not sequentially named, you'll have to do a little extra prep code. In this case, we'll read all the input variables from a pipe-separated list and use --
instead. --
selects all columns between two columns that you specify, including those two columns. e.g. foo--bar
would select foo
, bar
, and all variables in-between foo
and bar
.
/* Create a pipe-separated list of all input variables, making sure to
exclude the dependent variable */
proc sql noprint;
select name
into :varlist separated by '|'
from dictionary.columns
where memname = 'HAVE'
AND libname = 'WORK'
AND name NE 'Y'
order by varnum
;
quit;
%macro model(n);
%do i = 1 %to &n;
model&i: model y=%scan(&varlist, 1, |)--%scan(&varlist, &i, |) /adjrsq;
%end;
%mend;
proc reg data=have outest=out_table;
%model(10);
run;