r  dataframe  glmnet  coefficients

Coerce model coefficients to clean, 2-column dataframe


I am fitting an elastic net with cross-validation and want to see how large the coefficient is for each predictor:

lambda <- cv.glmnet(x = features_training, y = outcomes_training, alpha = 0)
elnet <- lambda$glmnet.fit
coefs <- coef(elnet, s = lambda$lambda.min, digits = 3)

The coefs variable contains a dgCMatrix:

                           1
(Intercept)    -1.386936e-16
ret             4.652863e-02
ind30          -2.419878e-03
spyvol          1.570406e-02

Is there a quick way to turn this into a data frame with two columns (one for the predictor name and one for the coefficient value)? Neither as.data.frame, as.matrix, nor chaining the two worked. In particular, I would like to sort the rows by the second column.


Solution

  • Another way, without hacking through attributes(): extract the row names and the matrix values directly. attributes(class(coefs)) shows that dgCMatrix is a sparse matrix class from the Matrix package. Note that the sample output comes from a different fitted model than the one in the question.

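    # rownames() gives the predictor names; matrix() turns the sparse
    # one-column dgCMatrix into an ordinary dense column
    data.frame(predict_names = rownames(coefs),
               coef_vals = matrix(coefs))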
    
    #    predict_names    coef_vals
    # 1    (Intercept) 21.117339411
    # 2    (Intercept)  0.000000000
    # 3            cyl -0.371338786
    # 4           disp -0.005254534
    # 5             hp -0.011613216
    # 6           drat  1.054768651
    # 7             wt -1.234201216
    # 8           qsec  0.162451314
    # 9             vs  0.771959823
    # 10            am  1.623812912
    # 11          gear  0.544171362
    # 12          carb -0.547415029
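
Since the question also asks about sorting, here is a minimal sketch of the same idea followed by order(). It assumes the Matrix package is installed, and it builds a small stand-in for coefs with sparseMatrix() from the values printed in the question so that it runs without glmnet. The names coef_df and sorted are only illustrative.

```r
library(Matrix)  # provides the dgCMatrix class and sparseMatrix()

# Stand-in for coef(elnet, s = lambda$lambda.min), built by hand from the
# values shown in the question (an assumption, so the example runs on its own).
coefs <- sparseMatrix(
  i = 1:4, j = rep(1, 4),
  x = c(-1.386936e-16, 4.652863e-02, -2.419878e-03, 1.570406e-02),
  dims = c(4, 1),
  dimnames = list(c("(Intercept)", "ret", "ind30", "spyvol"), "1")
)

# Two plain columns: predictor names and coefficient values.
coef_df <- data.frame(
  predictor   = rownames(coefs),
  coefficient = as.numeric(coefs[, 1]),  # coefs[, 1] is an ordinary numeric vector
  stringsAsFactors = FALSE
)

# Sort by the second column, smallest coefficient first.
sorted <- coef_df[order(coef_df$coefficient), ]
print(sorted)
```

To rank predictors by magnitude instead, order(-abs(coef_df$coefficient)) should work in place of the plain order() call.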