
Is there a function to return the matching response vector to model.matrix?


In glmnet() I have to supply the raw X matrix and response vector Y directly (unlike lm(), where you can specify a model formula). model.matrix() correctly removes incomplete observations from the X matrix, but its output doesn't include the response. So I end up with something like this:

mydf
glmnet(y = mydf$response, x = model.matrix(myformula, mydf)[,-1], ...)

When model.matrix() removes observations, the lengths of y and x no longer match. Is there a function to align y with x?


Solution

  • Try using model.frame and model.response.

    > d <- data.frame(y=rnorm(3), x=c(1,NA,2), z=c(NA, NA, 1))
    > d
               y  x  z
    1 -0.6257260  1 NA
    2 -0.4979723 NA NA
    3 -1.2233772  2  1
    > form <- y~x
    > mf <- model.frame(form, data=d)
    > model.response(mf)
            1         3
    -0.625726 -1.223377
    > model.matrix(form, mf)
      (Intercept) x
    1           1 1
    3           1 2
    attr(,"assign")
    [1] 0 1
    

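    I'm not familiar with glmnet; it might be the case that mf alone is sufficient, passing y=mf[,1] and x=mf[,-1] (column indexing, not row indexing). Since glmnet expects x to be a numeric matrix, the predictor columns may need as.matrix() first, and model.matrix() is the safer choice when the formula involves factors.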
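
Applied to the setup in the question, the approach could be wrapped up roughly as follows. This is only a sketch: `aligned_xy` is a hypothetical helper name, the toy `mydf` and the formula `response ~ a + b` are made up for illustration, and it assumes the default `na.action` (`na.omit`), so that model.frame() drops incomplete rows.

```r
# Hypothetical helper: build an aligned x/y pair from a formula and a data frame.
aligned_xy <- function(formula, data) {
  # model.frame() applies the NA handling once, to response and predictors together
  mf <- model.frame(formula, data = data)
  list(
    y = model.response(mf),
    # build the design matrix from the already-filtered frame; drop the intercept column
    x = model.matrix(formula, mf)[, -1, drop = FALSE]
  )
}

# Made-up example data with missing values in rows 2 and 3
mydf <- data.frame(response = c(0.5, 1.2, -0.3, 2.1, 0.8, -1.0),
                   a = c(1, NA, 3, 4, 5, 6),
                   b = c(2, 1, NA, 3, 5, 4))

xy <- aligned_xy(response ~ a + b, mydf)

# x and y now cover the same observations
stopifnot(nrow(xy$x) == length(xy$y))

# With glmnet installed, the pair can be passed straight through:
# fit <- glmnet::glmnet(x = xy$x, y = xy$y)
```

Because both pieces come from the same model frame, the row names of `xy$x` and the names of `xy$y` identify the same observations, which makes it easy to check the alignment.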