I am running a CatBoost classifier with the following settings:
model = CatBoostClassifier(iterations=1000, learning_rate=0.05, depth=7, loss_function='MultiClass', calc_feature_importance=True)
I have 5 classes, and the reported metric starts from negative values and increases while the model fits:
0: learn: -1.5036342 test: -1.5039740 best: -1.5039740 (0) total: 18s remaining: 4h 59m 46s
1: learn: -1.4185548 test: -1.4191364 best: -1.4191364 (1) total: 37.8s remaining: 5h 14m 24s
2: learn: -1.3475387 test: -1.3482641 best: -1.3482641 (2) total: 56.3s remaining: 5h 12m 1s
3: learn: -1.2868831 test: -1.2877465 best: -1.2877465 (3) total: 1m 15s remaining: 5h 12m 32s
4: learn: -1.2342138 test: -1.2351585 best: -1.2351585 (4) total: 1m 34s remaining: 5h 13m 56s
Is this normal behaviour? In most machine learning algorithms, log loss is positive and decreases during training. What am I missing here?
Yes, this is normal behaviour.
When you specify loss_function='MultiClass' in the parameters of your model, CatBoost optimises a different loss function, not LogLoss. The definition can be found here.
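For reference, here is my transcription of that objective, simplified by assuming unit object weights (the documented form also carries per-object weights w_i):

$$\text{MultiClass} = \frac{1}{N}\sum_{i=1}^{N} \log\frac{e^{a_{i t_i}}}{\sum_{j=0}^{M-1} e^{a_{ij}}}$$

where a_{ij} is the raw score of object i for class j, t_i is its true class, and M is the number of classes. The fraction is the softmax probability of the true class, so each term is the log of a value in (0, 1], and CatBoost maximises this quantity rather than minimising it.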
To understand the sign of that function, think of the best-case and worst-case scenarios. In the best case, the predicted probability mass for object i is concentrated entirely in the correct class t_i, so the fraction inside the log (in the formula on the linked page) equals 1 and the log is 0. As predictions diverge from that best case, the fraction inside the log shrinks towards 0, and the log becomes more and more negative. Since CatBoost maximises this objective, the values you see are negative and increase towards 0 as training improves.
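A small sketch to make this concrete: the snippet below computes the per-object term of the objective (log of the softmax probability of the true class), again assuming unit weights. The function name is mine for illustration, not part of the CatBoost API.

```python
import numpy as np

def per_object_multiclass(raw_scores, true_class):
    # Log of the softmax probability assigned to the true class.
    # Mirrors the per-object term of the MultiClass objective above
    # (unit weights assumed); not a CatBoost API function.
    log_norm = np.log(np.sum(np.exp(raw_scores)))
    return raw_scores[true_class] - log_norm

# Best case: nearly all probability mass on the correct class -> close to 0
print(per_object_multiclass(np.array([10.0, 0.0, 0.0, 0.0, 0.0]), 0))
# ~ -0.00018

# Uninformative (uniform) scores over 5 classes -> log(1/5)
print(per_object_multiclass(np.zeros(5), 0))
# ~ -1.60944
```

Note that with 5 classes a uniform prediction gives log(1/5) ≈ -1.609, which is consistent with your training log starting near -1.50 at iteration 0 and climbing towards 0 as the model improves.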