I am using LightGBM for a binary classification project, with the built-in 'logloss' as the loss function. However, I want early stopping to halt the iterations at the point where the precision-recall AUC is highest. So I have implemented the following custom eval function:
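    from sklearn.metrics import auc, precision_recall_curve

    def f_pr_auc(probas_pred, y_true):
        # y_true is the lgb.Dataset being evaluated; extract its labels
        labels = y_true.get_label()
        p, r, _ = precision_recall_curve(labels, probas_pred)
        score = auc(r, p)
        # (metric name, value, is_higher_better)
        return "pr_auc", score, True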
This custom eval function works well and I have updates like the following:
However, the iterations stopped at the lowest logloss value rather than at the highest pr_auc value. Is there a way to disable the logloss evaluation and evaluate only pr_auc?
For imbalanced datasets, the highest pr_auc value may not be achieved at the lowest logloss. So I'd like to stop the iterations when the highest pr_auc is achieved.
With the LightGBM Python API, set the metric option in your parameters dictionary to 'custom'. This registers no built-in metric, so only your custom eval function is evaluated:
    params = {
        ......
        'objective': 'binary',
        'metric': 'custom',
        ......
    }
    gbm = lgb.train(params,
                    lgb_train,
                    feval=f_pr_auc,
                    valid_sets=lgb_eval)