How do I superimpose lasso and ridge regression fits (Glmnet) onto data?


I have data (below), and have carried out linear, ridge, and lasso regression. For lasso and ridge regression I have found the optimal lambda using cross validation. I now want to superimpose the fitted models onto a y vs x plot of my original data. I have the linear model on the graph, I just can't figure out how to get the other two to appear. I have attempted it in ggplot, but an answer in base R would be really helpful too! Even if you could point me in the right direction, that would be great.

The models all fit without problems, and the linear regression line is already on the plot. However, when I try to add the other two fits in the same way, it doesn't work.

code to create the data

set.seed(1)
x <- rnorm(100)
y <- 1 + .2*x+3*x^2+.6*x^3 + rnorm(100)
d <- data.frame(x=x,y=y)
d$x2 <- d$x^2
d$x3 <- d$x^3
d$x4 <-d$x^4
d$x5 <-d$x^5

linear regression

f <- lm(y ~ ., data=d)

ridge regression

library(glmnet) 
x <- model.matrix(y ~ ., data=d)
y <- d$y

grid <- 0.001:50
ridge.fit <- glmnet(x,y,alpha=0, lambda = grid)

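cv.ridge <- cv.glmnet(x,y,alpha=0)  # cross-validate with the ridge penalty (cv.glmnet defaults to alpha=1)
r.fit.new <-  glmnet(x,y,alpha=0, lambda = cv.ridge$lambda.min)

lasso

lasso.fit <- glmnet(x,y,alpha=1, lambda = grid)
cv.lasso <- cv.glmnet(x,y,alpha=1)  # cross-validate with the lasso penalty
l.fit.new <- glmnet(x,y,alpha=1, lambda = cv.lasso$lambda.min)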

graph

library(ggplot2)
ggplot(data=d, aes(x=x, y=y)) + geom_point() + geom_line(aes(y=fitted(f)), colour="blue")

Solution

  • I changed your data-creation code slightly, renaming the x column to x.values:

    set.seed(1)
    x <- rnorm(100)
    y <- 1 + .2*x+3*x^2+.6*x^3 + rnorm(100)
    d <- data.frame(x.values=x,y=y)
    d$x2 <- d$x.values^2
    d$x3 <- d$x.values^3
    d$x4 <-d$x.values^4
    d$x5 <-d$x.values^5
    

    Keep the rest of your code (creating the model matrix and fitting the models) as it is.

    Next, collect the observed values and the fitted values from each model into one data frame for plotting:

    library(dplyr)
    data.for.plot <- d %>%
      select(x.values, y) %>%
      mutate(fitted_lm = as.numeric(fitted(f)),
             fitted_ridge_lm = as.numeric(predict(r.fit.new, newx = x)),
             fitted_lasso_lm = as.numeric(predict(l.fit.new, newx = x)))
    
    # Plot the data with all three fits overlaid
    library(ggplot2)
    ggplot(data.for.plot, aes(x = x.values, y = y)) +
      geom_point() +
      geom_line(aes(y = fitted_lm), colour = "blue") +
      geom_line(aes(y = fitted_ridge_lm), colour = "red") +
      geom_line(aes(y = fitted_lasso_lm), colour = "grey75") +
      theme_bw()
    


    In the combined plot the fits are hard to tell apart because they lie almost on top of each other (which is good news: the models agree). So let's reshape the data to long format and use faceting in ggplot to see each fit individually:

    library(tidyr)
    data.for.plot.long <- gather(data.for.plot, key = fit_type, value = fits, -x.values, -y)
    ggplot(data.for.plot.long, aes(x = x.values, y = y)) +
      geom_point() +
      geom_line(aes(y = fits, colour = fit_type)) +
      facet_wrap(~fit_type, ncol = 1, scales = "free") +
      theme_bw()
    
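    In current tidyr, gather() still works but is superseded by pivot_longer(). A minimal sketch of the same reshape, assuming a data frame laid out like data.for.plot; the numbers below are made up purely for illustration:

    ```r
    library(tidyr)

    # Toy stand-in for data.for.plot (hypothetical values, illustration only)
    data.for.plot <- data.frame(
      x.values        = c(-1, 0, 1),
      y               = c(3.1, 0.9, 4.8),
      fitted_lm       = c(3.0, 1.0, 4.9),
      fitted_ridge_lm = c(2.9, 1.1, 4.7),
      fitted_lasso_lm = c(3.0, 1.0, 4.8)
    )

    # Stack every column except x.values and y into name/value pairs
    data.for.plot.long <- pivot_longer(
      data.for.plot,
      cols      = -c(x.values, y),
      names_to  = "fit_type",
      values_to = "fits"
    )
    ```

    The long data frame has the same columns as the gather() result (x.values, y, fit_type, fits), so the ggplot call needs no changes.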

    The faceted plot shows each fit in its own panel.
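    Since you also asked about base R, here is a rough sketch of the overlaid plot without ggplot. It rebuilds the same data and models so that it runs on its own. Two things in it are my assumptions rather than your code: lambda is chosen by a separate cv.glmnet() call for each penalty, and the colours and legend position are arbitrary. The key point is that lines() joins points in the order they are given, so the x values have to be sorted first.

    ```r
    library(glmnet)

    # Recreate the data and the linear model from above
    set.seed(1)
    x.values <- rnorm(100)
    y <- 1 + .2*x.values + 3*x.values^2 + .6*x.values^3 + rnorm(100)
    d <- data.frame(x.values = x.values, y = y)
    for (p in 2:5) d[[paste0("x", p)]] <- d$x.values^p
    f <- lm(y ~ ., data = d)
    x <- model.matrix(y ~ ., data = d)

    # Assumption: one cross-validation per penalty (alpha = 0 ridge, alpha = 1 lasso)
    r.fit.new <- glmnet(x, y, alpha = 0, lambda = cv.glmnet(x, y, alpha = 0)$lambda.min)
    l.fit.new <- glmnet(x, y, alpha = 1, lambda = cv.glmnet(x, y, alpha = 1)$lambda.min)

    # Fitted values from each model, one column per fit
    fits <- data.frame(
      lm    = as.numeric(fitted(f)),
      ridge = as.numeric(predict(r.fit.new, newx = x)),
      lasso = as.numeric(predict(l.fit.new, newx = x))
    )

    ord  <- order(d$x.values)  # sort by x so each curve is drawn left to right
    cols <- c(lm = "blue", ridge = "red", lasso = "grey50")

    plot(d$x.values, d$y, xlab = "x.values", ylab = "y", pch = 16)
    for (nm in names(fits)) {
      lines(d$x.values[ord], fits[[nm]][ord], col = cols[[nm]], lwd = 2)
    }
    legend("topleft", legend = names(fits), col = cols, lwd = 2, bty = "n")
    ```

    As in the ggplot version, the three curves will largely overlap; calling par(mfrow = c(3, 1)) and drawing one plot per fit is the base R counterpart of faceting.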