I'm developing a fraud detection model using XGBoost.
I cannot share the data (sorry).
The CPU-based model works well and identifies frauds as expected.
The GPU-based model, however, identifies a much lower number of frauds at the same level of confidence.
This is the parameter list for the CPU model:
params = {"objective":"multi:softprob",
'booster':'dart',
'max_depth':5,
'eta':0.1,
'subsample':0.2,
'nthread':mp.cpu_count()-1,
'eval_metric':'merror',
'colsample_bytree':0.2,
'num_class':2}
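For context, this is roughly how I train it. The data below is a synthetic stand-in (since I can't share the real dataset) and the number of boosting rounds is just an example value:

import numpy as np
import xgboost as xgb

# Synthetic stand-in for the real data: 20 features and a binary fraud label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = rng.integers(0, 2, size=1000)

dtrain = xgb.DMatrix(X[:800], label=y[:800])
dvalid = xgb.DMatrix(X[800:], label=y[800:])

# 'params' is the CPU dict above; 100 rounds is only an example.
cpu_model = xgb.train(params, dtrain, num_boost_round=100,
                      evals=[(dvalid, 'validation')])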
The parameters for the GPU model training are:
params = {"objective":"multi:softprob",
'subsample':0.2,
'gpu_id':0,
'num_class':2,
'tree_method':'gpu_hist',
'max_depth':5,
'eta':0.1,
'gamma':1100,
'eval_metric':'mlogloss'}
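The comparison I'm making is essentially the sketch below, reusing the synthetic dtrain/dvalid from above. cpu_params and gpu_params stand for the two dicts shown (renamed here because both are called params), and the 0.5 cutoff is just an example confidence level:

# cpu_params / gpu_params: the two parameter dicts above, renamed to avoid the clash.
num_round = 100
cpu_model = xgb.train(cpu_params, dtrain, num_boost_round=num_round)
gpu_model = xgb.train(gpu_params, dtrain, num_boost_round=num_round)

# iteration_range makes the DART (CPU) booster score with all trees instead of
# applying dropout at prediction time; it is harmless for the GPU booster.
cpu_prob = cpu_model.predict(dvalid, iteration_range=(0, num_round))[:, 1]
gpu_prob = gpu_model.predict(dvalid, iteration_range=(0, num_round))[:, 1]

# Count rows flagged as fraud at the same confidence level for both models.
threshold = 0.5
print('CPU frauds:', int((cpu_prob >= threshold).sum()))
print('GPU frauds:', int((gpu_prob >= threshold).sum()))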
This is most likely due to the different tree-building parameters. You haven't set tree_method explicitly for the CPU run, so XGBoost falls back to its default (most probably the exact greedy algorithm), whereas the GPU run forces tree_method='gpu_hist', which finds splits from feature histograms rather than by scanning all exact split points. You can test this by adding tree_method='exact' to your CPU parameter list and checking whether you still get the same good accuracy as without it (see the sketch below). You can find more information on all the tree methods in the XGBoost documentation.
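A quick sketch of that check, assuming the same dtrain/dvalid objects as in the question (with xgboost imported as xgb) and with the CPU dict referred to as cpu_params:

# Train the CPU configuration twice: once as-is (default tree method) and once
# with tree_method forced to 'exact'.
params_default = dict(cpu_params)
params_exact = dict(cpu_params, tree_method='exact')

bst_default = xgb.train(params_default, dtrain, num_boost_round=100)
bst_exact = xgb.train(params_exact, dtrain, num_boost_round=100)

# If both score the validation set nearly identically, the default CPU run was
# effectively using the exact method, and the remaining gap comes from 'gpu_hist'.
print(bst_default.eval(dvalid))
print(bst_exact.eval(dvalid))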