r, linear-regression

Perform multiple linear regression with interaction terms using lm(), interpret the results with summary(), and produce diagnostic plots


I tried to perform a multiple linear regression with the lm() function, using code like the one below, but without success. I think the problem is the 'x1*x2' term.

data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), y = rnorm(100))
model <- lm(y ~ x1 + x2 + x1*x2)
summary(model)
plot(model)

It shows me an error. What should I do?


Solution

  • The error is not caused by your interaction term; when I tested it, that part worked fine. The problem is that you did not tell lm() where to find the variables. x1, x2 and y exist only as columns of data, so without the data argument lm() looks for them in your workspace and cannot find them. In the code below I also shortened the formula, because x1*x2 alone is sufficient: R expands it to x1 + x2 + x1:x2, so you don't have to list the main effects separately.

    data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), y = rnorm(100))
    model <- lm(y ~ x1 * x2, data = data)
    summary(model)
    #> 
    #> Call:
    #> lm(formula = y ~ x1 * x2, data = data)
    #> 
    #> Residuals:
    #>      Min       1Q   Median       3Q      Max 
    #> -2.21772 -0.77564  0.06347  0.56901  2.15324 
    #> 
    #> Coefficients:
    #>             Estimate Std. Error t value Pr(>|t|)  
    #> (Intercept) -0.05853    0.09914  -0.590   0.5564  
    #> x1           0.17384    0.09466   1.836   0.0694 .
    #> x2          -0.02830    0.08646  -0.327   0.7442  
    #> x1:x2       -0.00836    0.07846  -0.107   0.9154  
    #> ---
    #> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    #> 
    #> Residual standard error: 0.9792 on 96 degrees of freedom
    #> Multiple R-squared:  0.03423,    Adjusted R-squared:  0.004055 
    #> F-statistic: 1.134 on 3 and 96 DF,  p-value: 0.3392
    

    Created on 2023-01-14 with reprex v2.0.2
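
To check the points above yourself, here is a minimal sketch. It assumes a fresh R session, so that no objects named x1, x2 or y already exist in the workspace. The set.seed() call and the names err, labels_short, labels_long, model_short, model_long and same_fit are my own additions for illustration and are not part of the original code. It shows three things: the call without data fails, the short and long formulas describe the same model, and the four default diagnostic plots can be drawn on one page.

```r
# Hypothetical example data; set.seed() is only here to make the run repeatable
set.seed(1)
data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), y = rnorm(100))

# 1. Without `data`, lm() searches the workspace for y, x1 and x2.
#    They exist only as columns of `data`, so the call stops with an error.
#    tryCatch() captures the error message instead of aborting the script.
err <- tryCatch(
  lm(y ~ x1 + x2 + x1 * x2),
  error = function(e) conditionMessage(e)
)
print(err)

# 2. x1 * x2 is shorthand for x1 + x2 + x1:x2, so both formulas
#    contain the same terms ...
labels_short <- attr(terms(y ~ x1 * x2), "term.labels")
labels_long  <- attr(terms(y ~ x1 + x2 + x1 * x2), "term.labels")
print(labels_short)
print(identical(labels_short, labels_long))

#    ... and therefore give the same fitted coefficients.
model_short <- lm(y ~ x1 * x2, data = data)
model_long  <- lm(y ~ x1 + x2 + x1 * x2, data = data)
same_fit <- isTRUE(all.equal(coef(model_short), coef(model_long)))
print(same_fit)

# 3. Diagnostic plots: plot() on an lm object draws four plots by default.
#    A 2 x 2 layout puts them on one page; the old layout is restored afterwards.
old_par <- par(mfrow = c(2, 2))
plot(model_short)
par(old_par)
```

If you run this with Rscript instead of in an interactive session, the plots normally end up in a file called Rplots.pdf in the working directory. With purely random data like this, expect the diagnostic plots to look unremarkable and the coefficients to be close to zero.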