r regression glmnet lasso-regression

Lasso regression with glmnet: assigning the y value


I'm running a LASSO regression, but I'm having problems with my y term. I know x has to be a matrix and y has to be numeric, and both hold for my data. However, I don't think my model runs properly. I'll first show what I did and then what I think should be done (though I have no idea how to do it).

Here is what I did, using the nuclear dataset from R's boot package as an example.

library(boot)
data("nuclear")
attach(nuclear)
nuclear <- as.matrix(nuclear)

So I converted it to a matrix and then used that matrix for both x and y.

CV = cv.glmnet(x=nuclear,y=nuclear, family="multinomial", type.measure = "class", alpha = 1, nlambda = 100)

However, I feel my y argument is not correct. My dependent variable should go there, but how do I get it there? Assume that nuclear$pt is my dependent variable. Passing nuclear$pt as y does not work.

plot(CV)

fit = glmnet(x=nuclear, y=nuclear, family = "multinomial" , alpha=1, lambda=CV$lambda.1se)

If I then run this, it feels like my model didn't run at all. Something is probably amiss with my y, but I can't put my finger on it.


Solution

  • You used the same matrix for both x and y. You have to separate the independent variables from the dependent variable. For example, you can select them by column index:

    cv.glmnet(x = nuclear[, 1:10], y = nuclear[, 11], family = "binomial",
              type.measure = "class", alpha = 1, nlambda = 100)
    

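    This will use the first 10 columns of nuclear as the independent variables and the 11th column as the dependent variable. Because pt takes only two values, family = "binomial" is used here instead of "multinomial".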
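
As a rough end-to-end sketch of the same idea, the snippet below selects the columns by name rather than by position. It assumes the glmnet and boot packages are installed and that pt is the response, as stated in the question. The names x, y, cv and fit are illustrative, and set.seed(1) is only there to make the random cross-validation folds repeatable. Keeping nuclear as a data frame also avoids the problem from the question: once nuclear has been converted with as.matrix(), nuclear$pt fails because `$` does not work on a matrix. With only 32 rows, glmnet may warn about small class sizes.

```r
library(boot)    # ships the nuclear data set
library(glmnet)

data("nuclear", package = "boot")

# Predictors: every column except pt, converted to a numeric matrix
x <- as.matrix(nuclear[, setdiff(names(nuclear), "pt")])

# Response: pt is coded 0/1, so pass it as a two-level factor
y <- factor(nuclear$pt)

set.seed(1)  # cross-validation folds are assigned at random

# Cross-validate the lasso path (alpha = 1) on misclassification error
cv <- cv.glmnet(x, y, family = "binomial", type.measure = "class", alpha = 1)

# Refit at the lambda chosen by the one-standard-error rule
fit <- glmnet(x, y, family = "binomial", alpha = 1, lambda = cv$lambda.1se)

coef(fit)  # sparse coefficient vector; "." marks coefficients shrunk to zero
```

Selecting by name keeps the code correct even if the column order changes, which the index-based version does not guarantee.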