sas, logistic-regression, sas-studio

PROC LOGISTIC: how to output whether the fitted model predicted a 1 or 0


I am working with a dataset that resembles the one generated below:

DATA HAVE
    (DROP=I);
    DO I = 1 TO 100;
        Y = RAND("Integer",0,1);
        X1 = I ** RANUNI(I);
        X2 = I ** I ** RANUNI(I);
        output;
    END;
RUN;

I fit a logistic regression to this dataset like so:

PROC LOGISTIC 
    DATA=have
        PLOTS(ONLY)=NONE
    ;
    MODEL Y (Event = '1') = x1  /
        SELECTION=NONE
        LINK=LOGIT
    ;
    OUTPUT OUT=fitted_model
        PREDICTED = y_hat   
        PREDPROBS=INDIVIDUAL;
RUN;
QUIT;

What I'm getting as output is the predicted probability, but what I would like is the prediction itself, i.e. whether y_hat is a '1' or a '0'. Is this possible in SAS?


Solution

  • Logistic regression produces a probability, not a class label. You usually convert that probability to a predicted 0/1 with a user-defined cutoff, e.g. if the probability is greater than 0.7, assign 1, otherwise assign 0. Once you have chosen a cutoff, you can apply it in a data step.

    To identify a good cutoff, I recommend the CTABLE and PPROB= options on the MODEL statement, which print a classification table for each cutoff you list.

    data Neuralgia;
       input Treatment $ Sex $ Age Duration Pain $ @@;
       datalines;
    P  F  68   1  No   B  M  74  16  No  P  F  67  30  No
    P  M  66  26  Yes  B  F  67  28  No  B  F  77  16  No
    A  F  71  12  No   B  F  72  50  No  B  F  76   9  Yes
    A  M  71  17  Yes  A  F  63  27  No  A  F  69  18  Yes
    B  F  66  12  No   A  M  62  42  No  P  F  64   1  Yes
    A  F  64  17  No   P  M  74   4  No  A  F  72  25  No
    P  M  70   1  Yes  B  M  66  19  No  B  M  59  29  No
    A  F  64  30  No   A  M  70  28  No  A  M  69   1  No
    B  F  78   1  No   P  M  83   1  Yes B  F  69  42  No
    B  M  75  30  Yes  P  M  77  29  Yes P  F  79  20  Yes
    A  M  70  12  No   A  F  69  12  No  B  F  65  14  No
    B  M  70   1  No   B  M  67  23  No  A  M  76  25  Yes
    P  M  78  12  Yes  B  M  77   1  Yes B  F  69  24  No
    P  M  66   4  Yes  P  F  65  29  No  P  M  60  26  Yes
    A  M  78  15  Yes  B  M  75  21  Yes A  F  67  11  No
    P  F  72  27  No   P  F  70  13  Yes A  M  75   6  Yes
    B  F  65   7  No   P  F  68  27  Yes P  M  68  11  Yes
    P  M  67  17  Yes  B  M  70  22  No  A  M  65  15  No
    P  F  67   1  Yes  A  M  67  10  No  P  F  72  11  Yes
    A  F  74   1  No   B  M  80  21  Yes A  F  69   3  No
    ;
    
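    /* CTABLE requests a classification table; PPROB= lists the        */
    /* probability cutoffs to evaluate (here 0.3, 0.5, 0.6, 0.7, 0.8). */
    proc logistic data=Neuralgia;
       class Treatment Sex / param=ref;
       model Pain= Treatment Sex Treatment*Sex Age Duration / expb ctable pprob=(0.3, 0.5 to 0.8 by 0.1);
    run;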
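
    Once a cutoff is chosen, the data step mentioned above could look roughly like the sketch below. It assumes the fitted_model data set and the y_hat variable from the question's OUTPUT statement. The 0.5 cutoff and the names scored and y_pred are placeholders of mine, not anything PROC LOGISTIC creates. As far as I recall, PREDPROBS=INDIVIDUAL also writes an _INTO_ variable holding the response level with the highest predicted probability, which for a binary response amounts to a 0.5 cutoff, so it is worth checking whether that variable already appears in your output.

    ```sas
    /* Sketch only: convert the predicted probability into a 0/1 prediction. */
    /* Assumptions: cutoff of 0.5, output names scored and y_pred.            */
    data scored;
        set fitted_model;
        /* A missing probability would compare as less than the cutoff, */
        /* so keep it missing instead of silently predicting 0.         */
        if missing(y_hat) then y_pred = .;
        else y_pred = (y_hat >= 0.5);   /* 1 if at or above the cutoff, else 0 */
    run;

    /* Cross-tabulate observed versus predicted as a quick sanity check. */
    proc freq data=scored;
        tables Y * y_pred / norow nocol nopercent;
    run;
    ```

    Replace 0.5 with whichever cutoff the classification table suggests for your data.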