r, linear-regression

When using the summary() function on a linear regression model that has a factor as one of its variables, one of the factor levels goes missing


I have a dataframe that contains a column "AIEasyToUseFactor", a factor whose data is a Likert scale ranging from Strongly agree to Strongly disagree.

When I try to fit a linear regression model for its relation to how many times AI has been used in the previous month (a column called "HowManyTimesUsedLastMonth"), and then try to produce a summary of it, the "Agree" level of AIEasyToUseFactor goes missing. Any ideas on why this is?

The code:

levels(df$AIEasyToUseFactor) <- c("Agree", "Disagree", "Neither", "Str.Agree", "Str.Disagree")

linModel <- lm(HowManyTimesUsedLastMonth~AIEasyToUseFactor,data=df)

summary(linModel)

What it produces:

Call:
lm(formula = HowManyTimesUsedLastMonth ~ AIEasyToUseFactor, data = df)

Residuals:
   Min     1Q Median     3Q    Max 
-8.731 -4.922 -1.922  0.493 91.269 

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)    
(Intercept)                      4.922      1.148   4.288 2.69e-05 ***
AIEasyToUseFactorDisagree       -2.547      4.256  -0.598   0.5502    
AIEasyToUseFactorNeither        -3.553      2.203  -1.613   0.1082    
AIEasyToUseFactorStr.Agree       3.810      1.823   2.090   0.0377 *  
AIEasyToUseFactorStr.Disagree    2.442      3.678   0.664   0.5074    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 11.59 on 221 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.04805,   Adjusted R-squared:  0.03082 
F-statistic: 2.789 on 4 and 221 DF,  p-value: 0.02734

I'm completely new to R and RStudio so I haven't been able to attempt much bug-fixing by myself, I'm afraid.


Solution

  • Hi, this is normal behaviour for the lm() regression function.

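    The Agree level is supposed to go missing, because it is taken as the baseline (reference) level: by default, lm() uses the first level of a factor as the reference. If it were shown in the result, the estimate/coefficient for Agree would simply be 0, so the fitted value for Agree is the intercept on its own: f(Agree) = Intercept + 0 = 4.922.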

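    P.S. All the other levels are compared with Agree to get their estimates. For example, the estimate of -2.547 for AIEasyToUseFactorDisagree means that the fitted value for Disagree minus the fitted value for Agree is -2.547, so f(Disagree) = 4.922 - 2.547 = 2.375.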
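
To see the baseline behaviour on something small, here is a minimal sketch using made-up toy data (the data frame `toy` and its columns `uses` and `rating` are hypothetical, not the asker's survey data). It illustrates that the intercept equals the mean of the reference group, that the other coefficients are differences from that mean, and that relevel() can be used to pick a different reference level if another comparison is more useful.

```r
# Toy data, invented purely for illustration
toy <- data.frame(
  uses   = c(4, 6, 1, 3, 9, 11),
  rating = factor(c("Agree", "Agree", "Disagree", "Disagree",
                    "Str.Agree", "Str.Agree"))
)

# The first level listed is the one lm() treats as the reference
levels(toy$rating)

# Group means, for comparison with the coefficients below
group_means <- tapply(toy$uses, toy$rating, mean)
group_means

# Default fit: "Agree" is absorbed into the intercept
fit <- lm(uses ~ rating, data = toy)
coef(fit)
# (Intercept) is the "Agree" mean (5); ratingDisagree is 2 - 5 = -3;
# ratingStr.Agree is 10 - 5 = 5

# Make "Disagree" the reference level instead and refit
toy$rating2 <- relevel(toy$rating, ref = "Disagree")
fit2 <- lm(uses ~ rating2, data = toy)
coef(fit2)
# Now (Intercept) is the "Disagree" mean (2), and "Agree" gets its own row
```

Either fit describes the same model and gives the same fitted values; only the group that the other levels are compared against changes.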