Okay, so I'm running a LASSO regression but I'm having problems with my y term. I know x has to be a matrix and y has to be numeric, and both are true for my data. However, I don't think my model runs properly. I'll first show what I did and then what I think should be done (though I have no idea how to do it).
Here is what I did, using the nuclear dataset from R as an example.
library(boot)
library(glmnet)
data("nuclear")
attach(nuclear)
nuclear <- as.matrix(nuclear)
So I converted it to a matrix and then used that matrix for both x and y.
CV = cv.glmnet(x=nuclear,y=nuclear, family="multinomial", type.measure = "class", alpha = 1, nlambda = 100)
However, I feel my y argument is not correct. My dependent variable should go there, but how do I get it there? Assume that nuclear$pt is my dependent variable. Passing nuclear$pt as y does not work.
plot(CV)
fit = glmnet(x=nuclear, y=nuclear, family = "multinomial" , alpha=1, lambda=CV$lambda.1se)
If I then run this, it feels like my model didn't run at all. Something is probably amiss with my y, but I can't put my finger on it.
You used the same matrix for x and y. You have to separate the independent and dependent variables. For example, you can select them by column index:
cv.glmnet(x = nuclear[, 1:10], y = nuclear[, 11], family = "binomial",
          type.measure = "class", alpha = 1, nlambda = 100)
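This uses the first 10 columns of nuclear as the independent variables and the 11th column (pt) as the dependent variable. Because pt is binary, family = "binomial" replaces "multinomial". nuclear$pt did not work because nuclear had already been converted to a matrix, and `$` only works on data frames and lists, not on matrices.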
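As a rough end-to-end sketch of the same idea, you could pull out the response by name before converting anything to a matrix, which avoids relying on column positions. The names `x`, `y`, `cv_fit` and `fit` are just illustrative, and the seed is arbitrary. With only 32 rows, glmnet may warn about small class counts.

```r
library(boot)    # provides the nuclear data set
library(glmnet)

data("nuclear")

# Take the response while nuclear is still a data frame,
# because `$` does not work on a matrix.
y <- nuclear$pt

# Predictors: every column except pt, converted to a numeric matrix.
x <- as.matrix(nuclear[, setdiff(names(nuclear), "pt")])

set.seed(1)  # cross-validation folds are assigned at random
cv_fit <- cv.glmnet(x = x, y = y, family = "binomial",
                    type.measure = "class", alpha = 1, nlambda = 100)
plot(cv_fit)

# Coefficients at the lambda picked by the one-standard-error rule.
coef(cv_fit, s = "lambda.1se")

# Refit at that single lambda, mirroring the second call in the question.
fit <- glmnet(x = x, y = y, family = "binomial", alpha = 1,
              lambda = cv_fit$lambda.1se)
```

Selecting by name instead of `1:10` and `11` means the code keeps working if the column order ever changes.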