Tags: r, machine-learning, r-caret

Custom Performance Function in the caret Package Using Predicted Probability


This SO post is about using a custom performance measurement function in the caret package. You want to find the best prediction model, so you build several and compare them by calculating a single metric that is derived from comparing the observations and the predicted values. There are default functions that calculate such a metric, but you can also define your own metric function. This custom function must take obs and predicted values as input.
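
A minimal sketch of what such a custom function looks like, following caret's summaryFunction convention (plain accuracy here, just to show the shape; the function name is mine):

accuracySummary <- function(data, lev = NULL, model = NULL) {
  # data holds one row per held-out observation, with obs and pred columns
  out <- mean(data[, "pred"] == data[, "obs"])
  names(out) <- "Accuracy" # must match the metric argument passed to train()
  out
}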

In classification problems (let's say only two classes), the predicted value is 0 or 1. However, what I also need to evaluate is the probability calculated by the model. Is there any way to achieve this?

The reason is that there are applications where you need to know whether a prediction of 1 was made with a 99% probability or with a 51% probability, not just whether the prediction is 1 or 0.

Can anyone help?


Edit: OK, so let me try to explain a little better. In the documentation of the caret package, under 5.5.5 (Alternate Performance Metrics), there is a description of how to use your own custom performance function, like so:

fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10,
                           ## Estimate class probabilities
                           classProbs = TRUE,
                           ## Evaluate performance using 
                           ## the following function
                           summaryFunction = twoClassSummary)

twoClassSummary is the custom performance function in this example. The function provided here needs to take as input a data frame or matrix with obs and pred columns. And here's the point: I want to use a function that takes not observed and predicted values, but observed values and predicted probabilities.


One more thing:

Solutions from other packages are also welcome. The only thing I am not looking for is "This is how you write your own cross-validation function."


Solution

  • caret does support passing class probabilities to custom summary functions when you specify classProbs = TRUE in trainControl. In that case, the data argument passed to the custom summary function will have two additional columns, named after the classes, containing the probability of each class. The names of these classes are available in the lev argument, which is a vector of length 2.
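
    To verify this, here is a minimal sketch (hedged; the exact column set can vary across caret versions) that wraps caret's defaultSummary and prints the columns the summary function actually receives:

    inspectSummary <- function(data, lev = NULL, model = NULL) {
      print(colnames(data))            # e.g. "obs" "pred" "M" "R" when classProbs = TRUE
      defaultSummary(data, lev, model) # still return the default Accuracy/Kappa metrics
    }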

    See the Example:

    library(caret)
    library(mlbench)
    data(Sonar)
    

    Custom summary LogLoss:

    LogLoss <- function (data, lev = NULL, model = NULL){ 
      obs <- data[, "obs"] #truth
      cls <- levels(obs) #find class names
      probs <- data[, cls[2]] #use the second class name to extract the probabilities for the 2nd class
      probs <- pmax(pmin(as.numeric(probs), 1 - 1e-15), 1e-15) #bound the probabilities away from 0 and 1; this line and below are just the log loss calculation
      logPreds <- log(probs)        
      log1Preds <- log(1 - probs)
      real <- (as.numeric(data$obs) - 1)
      out <- c(mean(real * logPreds + (1 - real) * log1Preds)) * -1
      names(out) <- c("LogLoss") #important since this is specified in call to train. Output can be a named vector of multiple values. 
      out
    }
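
    For reference, the quantity computed above is the standard binary log loss, LogLoss = -(1/N) * Σᵢ [ yᵢ·log(pᵢ) + (1 − yᵢ)·log(1 − pᵢ) ], where yᵢ ∈ {0, 1} comes from obs and pᵢ is the predicted probability of the second class level.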
    
    fitControl <- trainControl(method = "cv",
                               number = 5,
                               classProbs = TRUE,
                               summaryFunction = LogLoss)
    
    
    fit <-  train(Class ~.,
                 data = Sonar,
                 method = "rpart", 
                 metric = "LogLoss" ,
                 tuneLength = 5,
                 trControl = fitControl,
                 maximize = FALSE) #important, depending on calculated performance measure
    
    fit
    #output
    CART 
    
    208 samples
     60 predictor
      2 classes: 'M', 'R' 
    
    No pre-processing
    Resampling: Cross-Validated (5 fold) 
    Summary of sample sizes: 166, 166, 166, 167, 167 
    Resampling results across tuning parameters:
    
      cp          LogLoss  
      0.00000000  1.1220902
      0.01030928  1.1220902
      0.05154639  1.1017268
      0.06701031  1.0694052
      0.48453608  0.6405134
    
    LogLoss was used to select the optimal model using the smallest value.
    The final value used for the model was cp = 0.4845361.
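
    Once trained, the same per-observation class probabilities can be pulled from the fitted model with type = "prob" (a sketch; for real evaluation you would pass held-out data via newdata rather than reusing Sonar):

    # one probability column per class level; each row sums to 1
    head(predict(fit, newdata = Sonar, type = "prob"))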
    

    Alternatively, use the lev argument, which contains the class levels, and add some error checking:

    LogLoss <- function (data, lev = NULL, model = NULL){ 
      if (length(lev) > 2) {
        stop(paste("Your outcome has", length(lev), "levels. The LogLoss() function isn't appropriate."))
      }
      obs <- data[, "obs"] #truth
      probs <- data[, lev[2]] #use second class name
      probs <- pmax(pmin(as.numeric(probs), 1 - 1e-15), 1e-15) #bound probability
      logPreds <- log(probs)        
      log1Preds <- log(1 - probs)
      real <- (as.numeric(data$obs) - 1)
      out <- c(mean(real * logPreds + (1 - real) * log1Preds)) * -1
      names(out) <- c("LogLoss")
      out
    }
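
    As a sanity check (a sketch; the multiControl name and the use of iris are mine), the guard above stops training before any resampling happens on a three-class outcome. Note also that newer versions of caret ship a built-in mnLogLoss summary function if you'd rather not roll your own:

    multiControl <- trainControl(method = "cv",
                                 number = 5,
                                 classProbs = TRUE,
                                 summaryFunction = LogLoss)
    # The call below would stop with:
    # "Your outcome has 3 levels. The LogLoss() function isn't appropriate."
    # train(Species ~ ., data = iris, method = "rpart",
    #       metric = "LogLoss", trControl = multiControl, maximize = FALSE)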
    

    Check out this section of the caret book for additional info: https://topepo.github.io/caret/model-training-and-tuning.html#metrics

    It's a great book to read if you plan on using caret, and even if you don't, it's still a good read.