sas output regression linear-regression sas-studio

Proc reg: checking the best three variables in parallel


I am working on a macro for regressions using the following code:

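%macro Regression;

  %let index = 1;

  /* Loop over each dependent variable listed in &Var2 */
  %do %until (%scan(&Var2, &index, %str( )) = );

    %let Ind = %scan(&Var2, &index, %str( ));

    /* Capture the stepwise selection summary for this dependent variable */
    ods output SelectionSummary = SelectionSummary;

    proc reg data = Regression2 plots = none;
      /* maxstep=1 stops after the first variable enters the model */
      model &Ind = &var / selection = stepwise maxstep = 1;
      /* Note: R= writes residuals, so RSQUARE here holds residuals, not R-square */
      output out = summary R = RSQUARE;
    run;
    quit;

    /* Stack the summaries: create FINAL on the first pass, append afterwards */
    %if &index = 1 %then %do;
      data final;
        set selectionsummary;
      run;
    %end;
    %else %do;
      data final;
        set final selectionsummary;
      run;
    %end;

    %let index = %eval(&index + 1);

  %end;

%mend Regression;

%Regression;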

This code works and produces a table showing, for each dependent variable, the independent variable that explains the most variation in it.

I'm looking for a way to run this so that the regression returns the three best independent variables for each dependent variable, each one evaluated as if it had been chosen as the first variable. For example:

models chosen:

GDP = Human Capital
GDP = Working Capital
GDP = Growth

DependentVar Ind1          Ind2            Ind3    Rsq1 Rsq2 Rsq3
GDP          human capital working capital growth  0.76 0.75 0.69

or

DependentVar Independent1    Rsq
GDP          human capital   0.76
GDP          working capital 0.75
GDP          growth          0.69

EDIT:

It would be an absolute bonus if there were a way to use stepwise with maxstep = 3 and get the best three combinations of independent variables for each dependent variable, with the condition that the first independent variable in each combination is unique.

TIA.


Solution

  • Try the STOP=3 option on your MODEL statement. It fits the best model with up to three variables. STOP= does not work with the stepwise selection method, but it does work with the maximum R-square improvement method (selection = maxR):

    model &Ind = &var / selection = maxR stop=3;
    

    If you only want to consider three-variable models, include START=3 as well.

    model &Ind = &var / selection = maxR stop=3 start=3;
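
    For the table layout shown in the question (the three best single-regressor models per dependent variable), one possible approach is the RSQUARE selection method with BEST=3 and both START= and STOP= set to 1. The following is only a sketch under assumptions: it reuses &Var2, &var and Regression2 from the question, the macro name Best3 and the data sets SubsetSummary and final are made up for illustration, and SubsetSelSummary is assumed to be the ODS table that PROC REG produces for SELECTION=RSQUARE (running ods trace on before the PROC will confirm the name on your installation).

```sas
/* Sketch: three best one-regressor models for each dependent variable.   */
/* Assumes &Var2 (dependent variables), &var (candidate regressors) and   */
/* the data set Regression2 exist as in the question. The names Best3,    */
/* SubsetSummary and final are hypothetical.                              */
%macro Best3;
  %local index Ind;
  %let index = 1;

  /* Start from an empty accumulator so reruns do not stack old results */
  proc datasets lib=work nolist nowarn;
    delete final;
  quit;

  %do %while (%length(%scan(&Var2, &index, %str( ))) > 0);
    %let Ind = %scan(&Var2, &index, %str( ));

    /* Assumption: SubsetSelSummary is the ODS table for SELECTION=RSQUARE */
    ods output SubsetSelSummary = SubsetSummary;

    proc reg data=Regression2 plots=none;
      /* start=1 stop=1 restricts the search to one-regressor models; */
      /* best=3 keeps the three with the highest R-square             */
      model &Ind = &var / selection=rsquare start=1 stop=1 best=3;
    run;
    quit;

    /* Append this dependent variable's rows to the running table */
    proc append base=final data=SubsetSummary force;
    run;

    %let index = %eval(&index + 1);
  %end;
%mend Best3;

%Best3;
```

    This should give the long layout (one row per dependent variable and regressor). Raising STOP= to 3 while keeping BEST=3 would report the three best models of each size up to three regressors, but it does not enforce the "unique first variable" condition. Because PROC APPEND with FORCE takes column lengths from the first data set appended, long variable names in later iterations may be truncated.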