I am trying to evaluate my XGBoost binary classification model. Using the sklearn
wrapper it is quite simple:
import xgboost as xgb
sk_model = xgb.XGBClassifier()
sk_model.fit(X_train, y_train)
sk_model.score(X_test, y_test)
I am trying to do the same thing without the wrapper (but still using the sklearn accuracy_score() function):
from sklearn.metrics import accuracy_score
DTrain = xgb.DMatrix(X_train, label=y_train)
DTest = xgb.DMatrix(X_test)
params = {"eta":0.3, "objective":"binary:hinge"}
model = xgb.train(params, DTrain, num_boost_round=50)
accuracy_score(y_test, model.predict(DTest))
Is there a simpler way to do that, something like model.score(DTest, y_test)? I also assumed that it didn't make a difference for the final prediction whether I use binary:hinge as my objective function or use binary:logistic and then set a threshold to 50%. Is this true or is there a difference?
Thanks!
Is there a simpler way to do that, something like model.score(DTest, y_test)?
Not to my knowledge.
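If you want something that reads like score(), a small helper is easy to write. The sketch below is only an illustration: booster_accuracy is a hypothetical name, not part of xgboost, and the 0.5 threshold is an assumption chosen so that the same function works for probability outputs (binary:logistic) and for hard 0/1 outputs (binary:hinge).

```python
import numpy as np


def booster_accuracy(booster, dmatrix, y_true, threshold=0.5):
    """Fraction of correct predictions for a binary model.

    Hypothetical helper (not part of xgboost). `booster` is anything with a
    predict(dmatrix) method, such as the object returned by xgb.train.
    """
    # predict() yields probabilities for binary:logistic and 0/1 values for
    # binary:hinge; thresholding at 0.5 turns both into class labels.
    preds = np.asarray(booster.predict(dmatrix))
    labels = (preds > threshold).astype(int)
    return float(np.mean(labels == np.asarray(y_true)))
```

With the objects from the question, the call would be booster_accuracy(model, DTest, y_test).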
I also assumed that it didn't make a difference for the final prediction whether I use binary:hinge as my objective function or use binary:logistic and then set a threshold to 50%. Is this true or is there a difference?
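That assumption is incorrect; there is a difference. Changing the loss (to hinge) changes the targets each tree is approximating: the gradients in "gradient boosting" (and the Hessians, in XGBoost's extension) are those of the objective/loss function, so the two objectives generally fit different trees and can end up with different final predictions.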
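To make that concrete, here is a rough sketch of the per-sample gradient and Hessian for the two losses, written with the textbook formulas. The function names are made up for illustration, and the constant Hessian used for the hinge loss is an assumption of this sketch (the true second derivative is zero almost everywhere); XGBoost's internal implementation may differ in detail.

```python
import numpy as np


def logistic_grad_hess(margin, y):
    """Gradient and Hessian of log loss w.r.t. the raw margin, y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-margin))  # sigmoid of the raw score
    return p - y, p * (1.0 - p)


def hinge_grad_hess(margin, y):
    """Subgradient of max(0, 1 - t * margin), with t = 2y - 1 in {-1, +1}."""
    t = 2.0 * y - 1.0
    active = t * margin < 1.0  # samples inside the margin or misclassified
    grad = np.where(active, -t, 0.0)
    # Stand-in Hessian: 1.0 where the loss is active (assumption, see above).
    hess = np.where(active, 1.0, 0.0)
    return grad, hess


margin = np.array([-2.0, 0.0, 0.5, 3.0])  # raw scores before any transform
y = np.array([0.0, 0.0, 1.0, 1.0])

print("logistic:", logistic_grad_hess(margin, y))
print("hinge:   ", hinge_grad_hess(margin, y))
```

The logistic gradients shrink smoothly as a sample becomes confidently correct, while the hinge gradients are a constant -1 or +1 until the margin is satisfied and zero afterwards, so each new tree is fit to different targets under the two objectives.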