predict.glmnet() gives same predictions for type = "link" and "response" using family = "binomial"


Take this case (classic crab data for logistic regression):

> library(glmnet)
> X <- read.table("http://www.da.ugent.be/datasets/crab.dat", header=T)[1:10,]
> Y <- factor(ifelse(X$Sa > 0, 1, 0))
> Xnew <- data.frame('W'=20,'Wt'=2000)
> fit.glmnet <- glmnet(x = data.matrix(X[,c('W','Wt')]), y = Y, family = "binomial")

Now I want to predict new values from Xnew:

According to the docs I can use predict.glmnet:

type

Type of prediction required. Type "link" gives the linear predictors for "binomial", "multinomial", "poisson" or "cox" models; for "gaussian" models it gives the fitted values. Type "response" gives the fitted probabilities for "binomial" or "multinomial", [...]

So this is what I do:

> predict.glmnet(object = fit.glmnet, type="response", newx = as.matrix(Xnew))[,1:5]
        s0         s1         s2         s3         s4 
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386 
> predict.glmnet(object = fit.glmnet, type="link", newx = as.matrix(Xnew))[,1:5]
        s0         s1         s2         s3         s4 
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386 

Both calls return the same values for the link and the response predictions, which is not what I expect. Calling the generic predict instead seems to give the correct values:

> predict(object = fit.glmnet, type="response", newx = as.matrix(Xnew))[,1:5]
       s0        s1        s2        s3        s4 
0.3000000 0.2835386 0.2678146 0.2528080 0.2384968 
> predict(object = fit.glmnet, type="link", newx = as.matrix(Xnew))[,1:5] 
        s0         s1         s2         s3         s4 
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386 

Is this a bug, or am I using predict.glmnet in a wrong way?


Solution

  • Within the glmnet package, your fitted object is of class lognet:

    > class(fit.glmnet)
    [1] "lognet" "glmnet"
    

    That is why you do not get the right result from predict.glmnet: called directly, it does not itself handle type="response" (as your output shows, it returns the linear predictor either way). You do get the expected result if you call predict.lognet:

    > predict.lognet(object = fit.glmnet, newx = as.matrix(Xnew), type="response")[,1:5]
           s0        s1        s2        s3        s4 
    0.3000000 0.2835386 0.2678146 0.2528080 0.2384968 
    > predict.lognet(object = fit.glmnet, newx = as.matrix(Xnew), type="link")[,1:5]
            s0         s1         s2         s3         s4 
    -0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386 
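
As a quick sanity check on how the two outputs relate, here is a minimal sketch that applies the inverse logit to the link values printed above. It assumes only base R; the variable names `link` and `response` are illustrative, and the numbers are copied from the output in this thread rather than recomputed from the model.

```r
# Link-scale predictions copied from the output above (log-odds).
link <- c(s0 = -0.8472979, s1 = -0.9269763, s2 = -1.0057390,
          s3 = -1.0836919, s4 = -1.1609386)

# Inverse logit: plogis(x) is 1 / (1 + exp(-x)).
response <- plogis(link)

# These should agree with the type = "response" values up to rounding.
print(round(response, 7))
```

If the rounded values agree with the type="response" output, the only difference between the two prediction types for a binomial fit is this transformation.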
    

    In any case, I recommend calling the generic predict and letting R's S3 dispatch choose the right method for the object's class.
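
To illustrate why calling a parent-class method by name can skip work done by a more specific method, here is a toy S3 sketch. The generic `describe` and the classes `child` and `base` are hypothetical stand-ins for predict, lognet and glmnet; this is not glmnet's actual code, only the dispatch pattern under that assumption.

```r
# Hypothetical generic and methods, for illustration only.
describe <- function(x, ...) UseMethod("describe")

# Method for the parent class.
describe.base <- function(x, ...) "base method"

# Method for the more specific class: calls the parent method,
# then post-processes its result.
describe.child <- function(x, ...) {
  parent_result <- NextMethod()
  paste("child method, then", parent_result)
}

# Class vector ordered from most to least specific, like c("lognet", "glmnet").
obj <- structure(list(), class = c("child", "base"))

via_generic <- describe(obj)       # dispatches to describe.child first
via_parent  <- describe.base(obj)  # calls the parent method directly

print(via_generic)
print(via_parent)
```

Calling the generic runs the most specific method first, while naming the parent method directly bypasses the child-level step entirely.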

    Hope it helps.