Calculating the fitted values of an interaction by hand from regression output


I am working with an interaction model similar to this one below:

set.seed(1993)

moderating <- sample(c("Yes", "No"), 100, replace = TRUE)
x <- sample(c("Yes", "No"), 100, replace = TRUE)
y <- sample(1:100, 100, replace = TRUE)

df <- data.frame(y, x, moderating)

Results <- lm(y ~ x*moderating)
summary(Results)
Call:
lm(formula = y ~ x * moderating)

Residuals:
    Min      1Q  Median      3Q     Max
-57.857 -29.067   3.043  22.960  59.043

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)         52.4000     6.1639   8.501 2.44e-13 ***
xYes                 8.4571     9.1227   0.927    0.356    
moderatingYes      -11.4435     8.9045  -1.285    0.202    
xYes:moderatingYes  -0.1233    12.4563  -0.010    0.992    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 30.82 on 96 degrees of freedom
Multiple R-squared:  0.04685, Adjusted R-squared:  0.01707
F-statistic: 1.573 on 3 and 96 DF,  p-value: 0.2009

I'm learning how to calculate the fitted values of an interaction from a regression table. In this example, the base (omitted) category is x = No and moderating = No.

Thus far, I know the following fitted values:

# Calculate fitted values for each x/moderating combination by hand
# Reference (omitted) category: x = No, moderating = No

X_no.M_no <- 52.4000
X_yes.M_no <- 52.4000 + 8.4571
X_no.M_yes <- 52.4000 + -11.4435
# X_yes.M_yes <- ?

I do not understand how the final category, X_yes.M_yes, is calculated. My first guess was X_yes.M_yes <- 52.4000 + -0.1233 (the intercept plus the interaction term), but that is incorrect. I know it's incorrect because the predict function gives a fitted value of 49.29032 for X_yes.M_yes, not 52.4000 + -0.1233 = 52.2767.

How do I calculate, by hand, the predicted value of the X_yes.M_yes category?

Here are the predicted values generated by R's predict function:

# Validate with predict(): build one row per combination of x and moderating.
# expand.grid() replaces the nested loop and avoids overwriting the global
# x and moderating vectors.
newdat <- expand.grid(
  x = na.omit(unique(df$x)),
  moderating = na.omit(unique(df$moderating)),
  stringsAsFactors = FALSE
)

Prediction.1 <- cbind(newdat, predict(Results, newdat, se.fit = TRUE))
Prediction.1

Solution

  • Your regression looks like this in math:

    hat_y = a + b x + c m + d m x
    

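    Here x = 1 when x is "Yes" and 0 when it is "No", and m is defined the same way for moderating. The coefficients map onto your output as a = (Intercept), b = xYes, c = moderatingYes, and d = xYes:moderatingYes.

    X_yes.M_yes means x = 1 and m = 1, so all four terms are switched on and the prediction is a + b + c + d. The interaction coefficient d is not a fitted value on its own; it is the extra adjustment applied on top of both main effects when x and m are both "Yes".

    In your notation: X_yes.M_yes <- 52.4000 + 8.4571 - 11.4435 - 0.1233, which gives 49.2903 and matches the 49.29032 from predict.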
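
If you want to check all four cells at once, here is a rough sketch that pulls the coefficients with coef() instead of retyping them from the table. It rebuilds the data from the question so it runs on its own. The names fit, cells, x_yes, m_yes, by_hand and by_predict are just ones I made up for illustration, and I pass data = df to lm(), which the original call did not do. The exact numbers can differ from the table above depending on your R version's random number generator, but the two columns should agree with each other.

```r
# Rebuild the example data and model from the question
set.seed(1993)
moderating <- sample(c("Yes", "No"), 100, replace = TRUE)
x <- sample(c("Yes", "No"), 100, replace = TRUE)
y <- sample(1:100, 100, replace = TRUE)
df <- data.frame(y, x, moderating)

# Assumption: fit with data = df (illustrative name "fit")
fit <- lm(y ~ x * moderating, data = df)

# Named coefficient vector: (Intercept), xYes, moderatingYes, xYes:moderatingYes
b <- coef(fit)

# One row per cell of the 2 x 2 design
cells <- expand.grid(x = c("No", "Yes"), moderating = c("No", "Yes"),
                     stringsAsFactors = FALSE)

# The 0/1 dummies that R builds behind the scenes ("No" is the reference level)
x_yes <- as.numeric(cells$x == "Yes")
m_yes <- as.numeric(cells$moderating == "Yes")

# hat_y = a + b x + c m + d m x
cells$by_hand <- b[["(Intercept)"]] +
  b[["xYes"]] * x_yes +
  b[["moderatingYes"]] * m_yes +
  b[["xYes:moderatingYes"]] * x_yes * m_yes

# Same cells through predict() for comparison
cells$by_predict <- unname(predict(fit, newdata = cells))

print(cells)
all.equal(cells$by_hand, cells$by_predict)
```

The last row (x = "Yes", moderating = "Yes") is the one where all four coefficients are added together.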