r, logistic-regression, p-value, significant-terms

How do I extract variables that have a low p-value in R?


I have a logistic model with plenty of interactions in R.

I want to extract only the significant terms, whether they are plain predictor variables or interactions.

Seeing only the significant interactions would also be fine, since I can still work out which non-significant variables they were built from.

Thank you.

This is as far as I have got:

broom::tidy(logmod)[,c("term", "estimate", "p.value")]

Solution

  • Here is one way. After fitting the logistic model, use a logical condition on the p-values to flag the significant terms, and grepl() (a regex match that returns a logical vector) to flag the interactions. The two logical index vectors can be combined with &; in the example below this returns no significant interactions at the alpha = 0.05 level.

    fit <- glm(am ~ hp + qsec*vs, mtcars, family = binomial)
    summary(fit)
    #> 
    #> Call:
    #> glm(formula = am ~ hp + qsec * vs, family = binomial, data = mtcars)
    #> 
    #> Deviance Residuals: 
    #>      Min        1Q    Median        3Q       Max  
    #> -1.93876  -0.09923  -0.00014   0.05351   1.33693  
    #> 
    #> Coefficients:
    #>               Estimate Std. Error z value Pr(>|z|)  
    #> (Intercept)  199.02697  102.43134   1.943   0.0520 .
    #> hp            -0.12104    0.06138  -1.972   0.0486 *
    #> qsec         -10.87980    5.62557  -1.934   0.0531 .
    #> vs          -108.34667   63.59912  -1.704   0.0885 .
    #> qsec:vs        6.72944    3.85348   1.746   0.0808 .
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    #> 
    #> (Dispersion parameter for binomial family taken to be 1)
    #> 
    #>     Null deviance: 43.230  on 31  degrees of freedom
    #> Residual deviance: 12.574  on 27  degrees of freedom
    #> AIC: 22.574
    #> 
    #> Number of Fisher Scoring iterations: 8
    
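    alpha <- 0.05
    # p-values are in the "Pr(>|z|)" column of the coefficient table
    pval <- summary(fit)$coefficients[, "Pr(>|z|)"]
    
    sig <- pval <= alpha                    # TRUE for significant terms
    intr <- grepl(":", names(coef(fit)))    # TRUE for interaction terms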
    
    coef(fit)[sig]
    #>         hp 
    #> -0.1210429
    
    coef(fit)[sig & intr]
    #> named numeric(0)
    

    Created on 2022-09-15 with reprex v2.0.2
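
If a table is more convenient than a named vector, the same two conditions can be collected into a data frame. This is only a minimal sketch on the same mtcars model as above; the names ctab, res, sig_terms and parents are illustrative and not part of the original answer. The last step splits any significant interaction terms on ":" to list the variables they were built from, which addresses the second half of the question.

```r
# Same model as in the answer above
fit <- glm(am ~ hp + qsec * vs, data = mtcars, family = binomial)
alpha <- 0.05

# Coefficient table: one row per term, p-values in the "Pr(>|z|)" column
ctab <- coef(summary(fit))

res <- data.frame(
  term        = rownames(ctab),
  estimate    = unname(ctab[, "Estimate"]),
  p.value     = unname(ctab[, "Pr(>|z|)"]),
  interaction = grepl(":", rownames(ctab), fixed = TRUE),  # TRUE for a:b terms
  row.names   = NULL,
  stringsAsFactors = FALSE
)

# Keep only the rows at or below the significance level
sig_terms <- res[res$p.value <= alpha, ]
sig_terms

# Variables that take part in a significant interaction (empty if there are none)
parents <- unique(unlist(strsplit(sig_terms$term[sig_terms$interaction],
                                  ":", fixed = TRUE)))
parents
```

For this model only hp passes the threshold, so sig_terms has a single row and parents is empty. Starting from the broom output in the question, the same row filter could probably be written as subset(broom::tidy(logmod), p.value <= alpha).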