
XGBoost custom evaluation function causing "cannot coerce type closure to vector of type"


I have tried a number of different things but cannot get rid of this error message. I don't see how my code differs from numerous other scripts.

y_train = train$y
train$y = c()
train = as.matrix(train)

train = xgb.DMatrix(data = train, label = y_train)

MSE = function(yhat,train){

   y = getinfo(train, "label")
   err = mean((y-yhat)^2)
   return(list(metric = "RMSE", value = err))

}

params = list(
  eta = 0.1,
  max_depth = 3,
  tweedie_variance_power = 1.5,
  objective = "reg:tweedie",
  feval = MSE
)

model = xgb.cv(
  data = train,
  nfold = 3,
  params = params,
  nrounds = 2000
)

I get the following error:

  Error in as.character(x) : 
  cannot coerce type 'closure' to vector of type 'character'

I find the traceback a bit odd (see below). I use custom folds, and xgb.cv runs fine if I remove feval and instead use the built-in nloglik eval metric.

  > traceback()
  7: FUN(X[[i]], ...)
  6: lapply(p, function(x) as.character(x)[1])
  5: `xgb.parameters<-`(`*tmp*`, value = params)
  4: xgb.Booster.handle(params, list(dtrain, dtest))
  3: FUN(X[[i]], ...)
  2: lapply(seq_along(folds), function(k) {
   dtest <- slice(dall, folds[[k]])
   dtrain <- slice(dall, unlist(folds[-k]))
   handle <- xgb.Booster.handle(params, list(dtrain, dtest))
   list(dtrain = dtrain, bst = handle, watchlist = list(train = dtrain, 
       test = dtest), index = folds[[k]])
   })
  1: xgb.cv(data = train, folds = folds, params = params, nrounds = 2000)

Any suggestions?


Solution

  • For what you need, passing it inside params through eval_metric would work:

    MSE = function(yhat,train){
       y = getinfo(train, "label")
       err = mean((y-yhat)^2)
       return(list(metric = "MSEerror", value = err))
    }
    params = list(
      eta = 0.1,
      max_depth = 3,
      tweedie_variance_power = 1.5,
      objective = "reg:tweedie",
      eval_metric = MSE
    )
    

    Using an example:

    library(xgboost)
    train = mtcars
    colnames(train)[1] = "y"
    y_train = train$y
    train$y = c()
    train = as.matrix(train)
    train = xgb.DMatrix(data = train, label = y_train)
    
    model = xgb.cv(
      data = train,
      nfold = 3,
      params = params,
      nrounds = 2000
    )
    
    head(model$evaluation_log)
       iter train_MSEerror_mean train_MSEerror_std test_MSEerror_mean
    1:    1            415.5046           20.92919           416.7119
    2:    2            410.6576           20.78001           411.8646
    3:    3            404.9321           20.59901           406.1391
    4:    4            398.2114           20.38003           399.4192
    5:    5            390.3808           20.11609           391.5902
    6:    6            381.3338           19.79950           382.5464
       test_MSEerror_std
    1:          62.18317
    2:          61.77277
    3:          61.28819
    4:          60.71951
    5:          60.05671
    6:          59.29019
    

    This also explains the error: feval is an argument of xgb.cv() (and xgb.train()) itself, not a training parameter. When you put a function inside params, `xgb.parameters<-` tries to coerce every entry to a character string (note the `lapply(p, function(x) as.character(x)[1])` in your traceback), and a closure cannot be coerced, hence the error. eval_metric in params is allowed to be a function, and passing feval = MSE directly to xgb.cv(), outside of params, works as well.
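
    As a quick sketch of that alternative (reusing the mtcars setup from the example above, with the same custom metric; the smaller nrounds is just to keep the run short):

    library(xgboost)

    # Same mtcars setup as above
    train = mtcars
    colnames(train)[1] = "y"
    y_train = train$y
    train$y = c()
    train = as.matrix(train)
    dtrain = xgb.DMatrix(data = train, label = y_train)

    MSE = function(yhat, dtrain){
       y = getinfo(dtrain, "label")
       return(list(metric = "MSEerror", value = mean((y - yhat)^2)))
    }

    params = list(
      eta = 0.1,
      max_depth = 3,
      tweedie_variance_power = 1.5,
      objective = "reg:tweedie"
    )

    # feval goes to xgb.cv() directly, not into params
    model = xgb.cv(
      data = dtrain,
      nfold = 3,
      params = params,
      nrounds = 50,
      feval = MSE
    )

    The evaluation_log should again contain MSEerror columns, as in the output shown above.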