How can I know which is the positive class value and negative class value for XGBoost?


I am working with an imbalanced dataset where I have a class variable of 2 different values: 0 and 1.

The number of '0' values is 1000 and the number of '1' values is 3000.

For XGBClassifier, LGBMClassifier and CatBoostClassifier I found a parameter called "scale_pos_weight", which adjusts the weight given to the positive class:

scale_pos_weight = number_of_negative_values / number_of_positive_values

My question is: how can we know which value of the class variable is treated as positive and which as negative?


Solution

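  • With 0/1 labels, the positive class is the one labeled 1 and the negative class is the one labeled 0. Which class is the minority does not change that. By convention, imbalanced binary problems encode the minority class as 1, so that the positive class is the rare one.

    Your data is the other way round: class 0 is the minority (1000 samples) and class 1 is the majority (3000 samples). The formula then gives scale_pos_weight = 1000 / 3000 ≈ 0.33, which down-weights the majority class 1.

    The default is scale_pos_weight=1 (no reweighting). Values greater than 1 are used when the positive class is the minority.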
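
As a minimal sketch of the calculation, assuming the labels are already encoded as 0 (negative) and 1 (positive), the ratio can be computed directly from the label counts. The helper name compute_scale_pos_weight is hypothetical, not part of any library, and the label list simply reproduces the counts given in the question.

```python
from collections import Counter


def compute_scale_pos_weight(labels):
    """Return n_negative / n_positive, treating label 1 as positive and 0 as negative.

    Hypothetical helper for illustration; not part of XGBoost, LightGBM or CatBoost.
    """
    counts = Counter(labels)
    n_neg, n_pos = counts[0], counts[1]
    if n_pos == 0:
        raise ValueError("no positive (label 1) samples found")
    return n_neg / n_pos


# Class counts from the question: 1000 zeros and 3000 ones.
y = [0] * 1000 + [1] * 3000
spw = compute_scale_pos_weight(y)
print(round(spw, 3))  # 0.333
```

The resulting value would then be passed to the classifier's constructor, for example XGBClassifier(scale_pos_weight=spw). Because the value is below 1 here, it reduces the weight of class 1 rather than increasing it.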