R lm: capture the interaction term, but not the categorical variable


I would like to estimate the following regression model: y = b0 + b1 * x + b2 * x * dummy

where y and x are continuous, and dummy is a categorical (dummy) variable.

In other words, I would like my estimated model to estimate three coefficients: b0, b1, and b2.

I have tried the following ...

lm(y ~ x + x * dummy, data)

but it adds the variable dummy to the model and estimates a coefficient for dummy as well.
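
A note on why this happens, as a minimal sketch: in R's formula syntax, x * dummy is shorthand for x + dummy + x:dummy, so the main effect of dummy is always included. The terms() function from base R lists the terms a formula expands to, without needing any data:

```r
# Terms produced by the * operator: both main effects plus the interaction
star_terms <- attr(terms(y ~ x + x * dummy), "term.labels")
print(star_terms)

# Terms produced by the : operator: only the interaction is added to x
colon_terms <- attr(terms(y ~ x + x:dummy), "term.labels")
print(colon_terms)
```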

The following comes close to what I want to do, but it converts the interaction term to a binary variable (true/false).

lm(y ~ x + I(!x * dummy), data)
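
The true/false result appears to be a consequence of operator precedence, sketched below with made-up values: in R, ! binds more loosely than *, so !x * dummy is parsed as !(x * dummy), which is a logical vector rather than a numeric product.

```r
# Hypothetical values, chosen only for illustration
x <- c(0.2, 0.7, 0.9)
dummy <- c(0, 1, 1)

negated <- !x * dummy   # parsed as !(x * dummy), so the result is logical
product <- x * dummy    # the numeric interaction the model actually needs

print(negated)
print(product)
print(identical(negated, !(x * dummy)))
```

Dropping the ! and writing I(x * dummy) keeps the term numeric, so lm() estimates a slope for it instead of a TRUE/FALSE contrast.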

For replication consider the following example:

library(tibble)  # provides tibble()
data <- tibble(y = rnorm(10), x = runif(10), dummy = ifelse(x > .5, 1, 0))
lm(y ~ x + x * dummy, data)
lm(y ~ x + I(!x * dummy), data)

Thanks


Solution

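  • Use the : operator instead of *. The term x:dummy adds only the interaction, without the main effect of dummy: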

    > summary(lm(y ~ x + x:dummy, data))
    
    Call:
    lm(formula = y ~ x + x:dummy, data = data)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -0.61312 -0.15558 -0.00354  0.23965  0.47351 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  0.06755    0.36162   0.187    0.857
    x            0.94953    1.18299   0.803    0.449
    x:dummy     -1.10220    0.88112  -1.251    0.251
    
    Residual standard error: 0.4148 on 7 degrees of freedom
    Multiple R-squared:  0.2645,    Adjusted R-squared:  0.05438 
    F-statistic: 1.259 on 2 and 7 DF,  p-value: 0.3412
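
As a quick check of this approach, the sketch below refits the model on simulated data. The seed, the evenly spaced x, and the data frame name d are assumptions for illustration, not the data behind the output above. It checks that exactly three coefficients are estimated and that, for a numeric 0/1 dummy, x:dummy gives the same estimates as I(x * dummy).

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Simulated data: evenly spaced x so that both dummy groups are non-empty
d <- data.frame(y = rnorm(10), x = seq(0.05, 0.95, length.out = 10))
d$dummy <- ifelse(d$x > 0.5, 1, 0)

# Interaction without the main effect of dummy, written two ways
fit_colon <- lm(y ~ x + x:dummy, data = d)
fit_product <- lm(y ~ x + I(x * dummy), data = d)

# Coefficient names of the : version (intercept, x, interaction)
print(names(coef(fit_colon)))

# Both parameterisations should give the same estimates
print(all.equal(unname(coef(fit_colon)), unname(coef(fit_product))))
```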