
Plotting ROC curve from two different algorithms using lift in caret


I have two models like the following:

library(mlbench)
data(Sonar)

library(caret)
set.seed(998)

my_data <- Sonar

fitControl <-
  trainControl(
    method = "boot632",
    number = 10,
    classProbs = TRUE,
    savePredictions = "final",
    summaryFunction = twoClassSummary
  )


modelxgb <- train(
  Class ~ .,
  data = my_data,
  method = "xgbTree",
  trControl = fitControl,
  metric = "ROC"
)

# Same setup as above (Sonar data, fitControl); reset the seed before the second fit
set.seed(998)

modelsvm <- train(
  Class ~ .,
  data = my_data,
  method = "svmLinear2",
  trControl = fitControl,
  metric = "ROC"
)

I want to plot the ROC curves for both models on one ggplot.

I am doing the following to generate the points for the curve:

for_lift_xgb = data.frame(Class = modelxgb$pred$obs,  xgbTree = modelxgb$pred$R)

for_lift_svm = data.frame(Class = modelsvm$pred$obs,  svmLinear2 = modelsvm$pred$R)

lift_obj_xgb = lift(Class ~ xgbTree, data = for_lift_xgb, class = "R")
lift_obj_svm = lift(Class ~ svmLinear2, data = for_lift_svm, class = "R")

What would be the easiest way to plot both curves on a single plot, in different colors? I would also like to annotate each model's AUC value on the plot.


Solution

  • After building the models, you can combine the predictions in a single data frame:

    for_lift = data.frame(Class = modelxgb$pred$obs,
                          xgbTree = modelxgb$pred$R,
                          svmLinear2 = modelsvm$pred$R)
    

    Then use it to build the lift object:

    lift = lift(Class ~ xgbTree + svmLinear2, data = for_lift, class = "R")
    

    Then plot with ggplot2:

    library(ggplot2)
    
    ggplot(lift$data)+
      geom_line(aes(1-Sp , Sn, color = liftModelVar))+
      scale_color_discrete(guide = guide_legend(title = "method"))
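
In the plot above, Sn is sensitivity (the true positive rate) and Sp is specificity, so plotting Sn against 1 - Sp traces the ROC curve. As a rough illustration of where such points come from, here is a minimal base-R sketch on made-up data. The `roc_points` helper and the toy vectors are hypothetical and are not part of caret; the sketch is in the same spirit as the Sn and Sp columns, not a copy of how `lift` computes them.

```r
# Toy data: observed classes and predicted probability of class "R" (made up).
obs   <- factor(c("R", "R", "M", "R", "M", "M"), levels = c("M", "R"))
score <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

# Hypothetical helper: one (Sn, Sp) pair per distinct cutoff.
roc_points <- function(obs, score, positive = "R") {
  # Start at Inf so the curve begins at (1 - Sp, Sn) = (0, 0).
  cutoffs <- c(Inf, sort(unique(score), decreasing = TRUE))
  is_pos  <- obs == positive
  data.frame(
    cutoff = cutoffs,
    # Sensitivity: share of positives scored at or above the cutoff.
    Sn = sapply(cutoffs, function(k) mean(score[is_pos] >= k)),
    # Specificity: share of negatives scored below the cutoff.
    Sp = sapply(cutoffs, function(k) mean(score[!is_pos] < k))
  )
}

pts <- roc_points(obs, score)
pts
```

Plotting `1 - pts$Sp` against `pts$Sn` gives the same kind of curve as the ggplot call above, just for the toy scores.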
    


    You can combine and compare many models this way.
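
One caveat when combining models this way: building `for_lift` column by column assumes that both models' `pred` data frames hold the same held-out rows in the same order. caret stores a `rowIndex` and a `Resample` column with each saved prediction, so a more defensive option is to merge on those keys. The sketch below uses small stand-in data frames: `pred_xgb` and `pred_svm` are hypothetical substitutes for `modelxgb$pred` and `modelsvm$pred`, with made-up values. Whether the real resamples match depends on how the models were trained.

```r
# Stand-ins for modelxgb$pred and modelsvm$pred (made-up values).
pred_xgb <- data.frame(
  obs      = c("R", "M", "R", "M"),
  R        = c(0.9, 0.2, 0.7, 0.4),
  rowIndex = c(1, 2, 1, 3),
  Resample = c("Resample01", "Resample01", "Resample02", "Resample02")
)
# Same held-out rows, but stored in a different order.
pred_svm <- data.frame(
  obs      = c("M", "R", "M", "R"),
  R        = c(0.3, 0.6, 0.5, 0.8),
  rowIndex = c(3, 1, 2, 1),
  Resample = c("Resample02", "Resample02", "Resample01", "Resample01")
)

keys <- c("rowIndex", "Resample")

# Merge on the keys so each row pairs predictions for the same held-out sample.
merged <- merge(
  pred_xgb[, c(keys, "obs", "R")],
  pred_svm[, c(keys, "R")],
  by = keys,
  suffixes = c("_xgb", "_svm")
)

for_lift_aligned <- data.frame(
  Class      = factor(merged$obs, levels = c("M", "R")),
  xgbTree    = merged$R_xgb,
  svmLinear2 = merged$R_svm
)
for_lift_aligned
```

If the merged result has fewer rows than either input, the two models were not evaluated on the same resamples and their curves are not directly comparable row by row.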

    To add the AUC values to the plot, create a data frame with the model names, the corresponding AUC values, and the y coordinates for the labels:

    auc_ano <- data.frame(model = c("xgbTree","svmLinear2"),
                          auc = c(pROC::roc(response = for_lift$Class,
                                            predictor = for_lift$xgbTree,
                                            levels=c("M", "R"))$auc,
                                  pROC::roc(response = for_lift$Class,
                                            predictor = for_lift$svmLinear2,
                                            levels=c("M", "R"))$auc),
                          y = c(0.95, 0.9))
    auc_ano
    #output
           model       auc    y
    1    xgbTree 0.9000756 0.95
    2 svmLinear2 0.5041086 0.90
    

    Then pass it to geom_text:

    ggplot(lift$data)+
      geom_line(aes(1-Sp , Sn, color = liftModelVar))+
      scale_color_discrete(guide = guide_legend(title = "method"))+
      geom_text(data = auc_ano, aes(label = round(auc, 4), color = model, y = y), x = 0.1)
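
Because `pROC::roc` chooses the comparison direction automatically unless `direction` is supplied, it can be worth cross-checking an annotated AUC independently. The following is a minimal base-R sketch using the rank (Mann-Whitney) formulation on made-up data. `auc_rank` is a hypothetical helper, not a pROC or caret function.

```r
# Hypothetical helper: AUC as the probability that a randomly chosen positive
# gets a higher score than a randomly chosen negative (ties count as half).
auc_rank <- function(obs, score, positive = "R") {
  is_pos <- obs == positive
  n_pos  <- sum(is_pos)
  n_neg  <- sum(!is_pos)
  r      <- rank(score)  # average ranks, so ties are split evenly
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data (made up): observed classes and predicted probability of "R".
obs   <- factor(c("R", "R", "M", "R", "M", "M"), levels = c("M", "R"))
score <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

auc_toy <- auc_rank(obs, score)
round(auc_toy, 4)
```

With the real data, the equivalent call would be `auc_rank(for_lift$Class, for_lift$xgbTree)`. A value below 0.5 here, where pROC reports one above 0.5, would indicate that the direction was flipped automatically.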
    
