Tags: r, cross-validation, xgboost, loss-function

Matching XGBoost eval_metric cross-validation calculations with weights


I am trying to recreate the mean and standard deviation of the evaluation metrics reported by xgb.cv. The issue is easiest to demonstrate with code.

library(xgboost)
library(ModelMetrics)
library(rBayesianOptimization)

First, without weights:

data(agaricus.train, package='xgboost')
dt <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
# KFold returns a list of row-index vectors, one per fold
dt.folds <- KFold(as.matrix(agaricus.train$label),
                  nfolds = 5,
                  stratified = TRUE,
                  seed = 23)
cv <- xgb.cv(data = dt, nrounds = 3, nthread = 2, folds = dt.folds, metrics = list("logloss","auc"),
             max_depth = 3, eta = 1, objective = "binary:logistic", prediction = TRUE)
# Per-fold log loss on the out-of-fold predictions
test <- sapply(cv$folds, function(x){
  testSet <- unlist(cv$pred[x])
  logLoss(agaricus.train$label[x], testSet)
})

> cv$evaluation_log$test_logloss_mean
[1] 0.1615132 0.0655742 0.0262498

> mean(test)
[1] 0.02624984

As expected, the last mean logloss from the cv object matches my calculations.
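
The reported standard deviation can be checked the same way. xgb.cv appears to report the population standard deviation across folds (dividing by the number of folds rather than n - 1); a sketch of that check:

# Population standard deviation of the per-fold metrics
# (divide by n, not n - 1, to match cv$evaluation_log)
sd_pop <- sqrt(sum((test - mean(test))^2) / length(test))
sd_pop
cv$evaluation_log$test_logloss_std[3]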

However, when I add weights, changing only the dt declaration line:

dt <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label, weight = 1:length(agaricus.train$label))

> cv$evaluation_log$test_logloss_mean
[1] 0.1372536 0.0509958 0.0219024
> mean(test)
[1] 0.02066699

Now they do not match. What is the xgb.cv function doing differently when it calculates the loss metrics? Adding weights changes the AUC calculation as well, and I suspect it affects any evaluation metric. How can I change my calculations to match the output?


Solution

  • Partially Solved:

    Using a weighted log loss function gives nearly identical results.

    wLogLoss <- function(actual, predicted, weights) {
      # Weight each observation's log loss, then normalise by the total weight
      -sum(weights * (actual * log(predicted) + (1 - actual) * log(1 - predicted))) / sum(weights)
    }
    
    # ww holds the weights that were passed to xgb.DMatrix above
    ww <- 1:length(agaricus.train$label)
    
    calc <- sapply(cv$folds, function(x){
      testSet <- unlist(cv$pred[x])
      wLogLoss(agaricus.train$label[x], testSet, ww[x])
    })
    
    > mean(calc)
    [1] 0.02190241
    > cv$evaluation_log$test_logloss_mean[3]
    [1] 0.0219024
    > var(calc)*4/5
    [1] 0.00001508648
    > cv$evaluation_log$test_logloss_std[3]^2
    [1] 0.00001508551
    

    Minor differences in the variance still exist. For the mean, xgb.cv evidently normalises each fold's weighted loss by that fold's total weight, which is exactly what wLogLoss does; the tiny remaining discrepancy is plausibly just single-precision arithmetic inside xgboost. I would still be interested in knowing how exactly the xgboost package applies the weights internally.
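
    The same pair-weighting idea should carry over to AUC. My assumption (not verified against the xgboost source) is that its weighted AUC is the weighted Mann-Whitney statistic: each positive-negative pair contributes the product of its two weights, with half credit for ties. A sketch of that calculation:

    wAUC <- function(actual, predicted, weights) {
      pos <- which(actual == 1)
      neg <- which(actual == 0)
      num <- 0
      for (i in pos) {
        gt <- predicted[neg] < predicted[i]   # correctly ordered pairs
        eq <- predicted[neg] == predicted[i]  # tied pairs get half credit
        num <- num + weights[i] * (sum(weights[neg][gt]) + 0.5 * sum(weights[neg][eq]))
      }
      num / (sum(weights[pos]) * sum(weights[neg]))
    }
    
    wauc <- sapply(cv$folds, function(x){
      wAUC(agaricus.train$label[x], unlist(cv$pred[x]), ww[x])
    })
    mean(wauc)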