I'm trying to train a light gradient boosting machine classification algorithm (LGBMClassifier). I have three classes in my data, and I assigned them -1, 0 and 1. These labels hold specific meaning, so I would not like to change them to other labels (like 0, 1 and 2).
When I try to train the model, I get the following error message:
Label must be in [0, 3), but found -1 in label
Is it a requirement that, if I have three labels, they have to be designated 0, 1 and 2?
Update #1
Thanks JST99, this is the correct approach. I defined it as below and it worked correctly. Take note: if the prediction output is a probability vector of length n_classes per sample and you convert it to the class with the highest probability using np.argmax, you get class indices ranging from 0 to n_classes - 1, so you have to take care to convert these back to your original classes ([-1, 0, 1] in my case).
    import lightgbm as lgb

    # Hyperparameters passed to the classifier as keyword arguments.
    hyperparameter_dictionary = {
        'task': 'train',
        'boosting_type': 'gbdt',
        'objective': 'multiclass',
        'metric': 'multi_logloss',
        'num_leaves': 100,
        'learning_rate': 0.05,
        'feature_fraction': 0.9,
        'bagging_fraction': 0.9,
        'bagging_freq': 0,
        'verbose': -1,
        'num_class': 3,
        'classes': [-1, 0, 1],
    }
    # X is the feature matrix and y holds the labels (-1, 0 and 1).
    model = lgb.LGBMClassifier(**hyperparameter_dictionary)
    model.fit(X, y)
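The conversion back to the original labels mentioned above can be sketched roughly like this. It is only an illustration, not part of LightGBM's API: `proba` is a made-up stand-in for the model's probability output, and the sketch assumes the probability columns are ordered as [-1, 0, 1].

```python
import numpy as np

# Original labels, in the order assumed for the probability columns.
classes = np.array([-1, 0, 1])

# Hypothetical probability output: 4 samples (rows) x 3 classes (columns).
# In practice this would come from the trained model.
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
])

# argmax along each row gives column indices in 0..n_classes-1 ...
idx = np.argmax(proba, axis=1)

# ... and indexing into `classes` maps them back to the original labels.
predicted_labels = classes[idx]
print(predicted_labels)
```

Keeping the labels in a single `classes` array means the same array can be reused wherever an index has to be turned back into a label.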
According to the docs, you can specify an additional keyword argument, classes, when initializing the classifier. Specifically, classes is of
    Type: array of shape = [n_classes]
Therefore, we can try something like
    clf = lightgbm.LGBMClassifier(..., classes=[-1, 0, 1])
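If that argument is not accepted in your setup, a possible fallback is to encode the labels as 0, 1, 2 just for training and decode them afterwards. This is an assumption on my part rather than something taken from the docs quoted above, and the label array `y` below is made-up example data.

```python
import numpy as np

# Example labels using the original coding from the question.
y = np.array([-1, 0, 1, 1, -1, 0])

# np.unique returns the sorted distinct labels; with return_inverse=True it
# also returns, for each element of y, its index in that sorted array,
# i.e. an encoding with values in 0..n_classes-1.
classes, y_encoded = np.unique(y, return_inverse=True)

# y_encoded is what would be passed to the model in place of y.
# Indexing `classes` with encoded values recovers the original labels.
y_decoded = classes[y_encoded]
```

The original labels keep their meaning because `classes` records the mapping, so nothing outside the training call has to know about the 0, 1, 2 coding.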