r  glmnet  coefficients  dummy-variable

Dummy coefficients back to factor


Say I have a glmnet model trained on a sparse matrix built from several categorical predictors, each with a different number of levels (and consequently a different number of dummy columns).

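library(glmnet)

# One categorical predictor; model.matrix() expands it into dummy columns
df <- data.frame(y = runif(10), catVar = as.factor(sample(0:5, 10, TRUE)))
A <- model.matrix(y ~ catVar, df)
# Fit on two of the dummy columns (levels 3 and 4 must occur in the sample)
train <- cv.glmnet(A[, c("catVar3", "catVar4")], df$y)
coef(train, s = "lambda.min")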

What is the most efficient way to convert the dummy coefficients (or the overall model formula) back to the form they would have if each predictor were a single factor column rather than a set of dummy columns?

EDIT: I need to convert the dummy coefficients and their slopes/values back into one coefficient per factor level, with a separate slope for each level.


Solution

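  • Adapting a slick example from the mailing list, you can recover the original factor from the full dummy matrix A. Multiplying A by 1:n gives every level a distinct integer code (the intercept column contributes 1 and each dummy column contributes its column index), and factor() then relabels those codes with the original levels. This assumes A keeps the intercept and all dummy columns, and that every level occurs in the data: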

    n <- length(levels(df$catVar))
    factor(A%*%1:n, labels = levels(df$catVar))
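
The EDIT in the question asks for one slope per level rather than the factor itself. A minimal base-R sketch of that mapping follows, under some assumptions: the coefficient values are made up for illustration, level_coefs is a hypothetical helper name (not part of glmnet), the dummy columns follow model.matrix()'s default naming of variable name followed by level, and any level without a fitted dummy (the baseline, or a column left out of the fit) gets a slope of 0.

```r
# Hypothetical coefficient vector, shaped like the named vector you would get
# from a fitted model. The numbers are invented for illustration only.
beta <- c("(Intercept)" = 0.50, catVar3 = 0.12, catVar4 = -0.07)

# All levels of the original factor, as returned by levels(df$catVar)
lv <- as.character(0:5)

# Hypothetical helper: one row per level of `var`.
# Levels with no matching dummy coefficient keep a slope of 0.
level_coefs <- function(beta, var, lv) {
  slopes <- setNames(numeric(length(lv)), lv)
  dummy_names <- paste0(var, lv)        # default model.matrix() dummy names
  hit <- dummy_names %in% names(beta)   # which levels have a fitted dummy
  slopes[hit] <- beta[dummy_names[hit]]
  data.frame(
    variable = var,
    level = lv,
    slope = unname(slopes),
    # intercept plus slope: the fitted value when this is the only predictor
    fitted = unname(beta[["(Intercept)"]] + slopes),
    row.names = NULL
  )
}

tab <- level_coefs(beta, "catVar", lv)
print(tab)
```

With a real fit, beta could probably be obtained with as.matrix(coef(train, s = "lambda.min"))[, 1], since coef() on a cv.glmnet object returns a sparse one-column matrix. For several categorical predictors, call the helper once per variable and stack the results with rbind(). Matching on pasted names can be ambiguous if one variable name is a prefix of another.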