python, numpy, machine-learning, scikit-learn, loss

How to compute binary log loss per sample of a scikit-learn ML model


I am attempting to apply binary log loss to a Naive Bayes ML model I created. I generated a categorical prediction dataset (yNew) and a probability dataset (probabilityYes), but I can't successfully run them through a log loss function.

The simple sklearn.metrics function gives a single log-loss result, and I'm not sure how to interpret it:

from sklearn.metrics import log_loss
ll = log_loss(yNew, probabilityYes, eps=1e-15)
print(ll)
0.0819...

A more complex function returns a value of 2.55 for each NO and 2.50 for each YES (90 columns in total); again, I have no idea how to interpret this:

import scipy as sp

def logloss(yNew, probabilityYes):
    # clip probabilities away from 0 and 1 so the logs stay finite
    epsilon = 1e-15
    probabilityYes = sp.maximum(epsilon, probabilityYes)
    probabilityYes = sp.minimum(1 - epsilon, probabilityYes)

    # compute logloss function (vectorised)
    ll = sum(yNew * sp.log(probabilityYes) +
             sp.subtract(1, yNew) * sp.log(sp.subtract(1, probabilityYes)))
    ll = ll * -1.0 / len(yNew)
    return ll

print(logloss(yNew,probabilityYes))
2.55352047 2.55352047 2.50358354 2.55352047 2.50358354 2.55352047 .....

Solution

  • Here is how you can compute the loss per sample:

    import numpy as np
    
    def logloss(true_label, predicted, eps=1e-15):
      # clip the prediction away from 0 and 1 so the log is always finite
      p = np.clip(predicted, eps, 1 - eps)
      if true_label == 1:
        return -np.log(p)
      else:
        return -np.log(1 - p)
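
    If you'd rather avoid a Python-level loop, a vectorized sketch of the same idea (using np.clip and np.where to pick the right term for each label) looks like this; either version can be checked with the dummy data below:

    def logloss_vec(true_labels, predicted, eps=1e-15):
      # clip, then select -log(p) or -log(1 - p) per sample in one step
      p = np.clip(predicted, eps, 1 - eps)
      return np.where(np.asarray(true_labels) == 1, -np.log(p), -np.log(1 - p))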
    

    Let's check it with some dummy data (we don't actually need a model for this):

    predictions = np.array([0.25,0.65,0.2,0.51,
                            0.01,0.1,0.34,0.97])
    targets = np.array([1,0,0,0,
                       0,0,0,1])
    
    ll = [logloss(x,y) for (x,y) in zip(targets, predictions)]
    ll
    # result:
    [1.3862943611198906,
     1.0498221244986778,
     0.2231435513142097,
     0.7133498878774648,
     0.01005033585350145,
     0.10536051565782628,
     0.41551544396166595,
     0.030459207484708574]
    

    From the array above, you should be able to convince yourself that the farther a prediction is from the corresponding true label, the greater the loss, as we would expect intuitively.
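
    For instance, sorting the per-sample losses together with their true labels and predictions (a quick sketch using the arrays already defined above) makes this explicit:

    # largest losses first: the worst predictions sit farthest from their labels
    for loss, t, p in sorted(zip(ll, targets, predictions), reverse=True):
      print(round(loss, 3), t, p)
    # 1.386 1 0.25
    # 1.05 0 0.65
    # 0.713 0 0.51
    # ...
    # 0.01 0 0.01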

    Let's just confirm that the computation above agrees with the total (average) loss as returned by scikit-learn:

    from sklearn.metrics import log_loss
    
    ll_sk = log_loss(targets, predictions)
    ll_sk
    # 0.4917494284709932
    
    np.mean(ll)
    # 0.4917494284709932
    
    np.mean(ll) == ll_sk
    # True
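
    As one more cross-check, log_loss also takes a normalize argument; with normalize=False it returns the sum of the per-sample losses instead of their mean:

    ll_sum = log_loss(targets, predictions, normalize=False)

    np.isclose(ll_sum, np.sum(ll))
    # True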
    

    Code adapted from here [link is now dead].
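
  • One more thing that may be worth checking in the original code (just a guess, since the shapes of yNew and probabilityYes aren't shown in the question): if probabilityYes is the full two-column output of predict_proba, the elementwise formula operates on both class columns at once, whereas per-sample log loss only needs the probability of the positive class (which column that is follows the model's classes_ attribute). A dummy illustration:

    import numpy as np

    # dummy stand-in for a predict_proba result: one row per sample,
    # columns assumed here to be [P(NO), P(YES)]
    proba = np.array([[0.9, 0.1],
                      [0.2, 0.8],
                      [0.7, 0.3]])

    p_yes = proba[:, 1]   # keep only the YES-class probabilities
    p_yes
    # array([0.1, 0.8, 0.3])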