python, numpy, machine-learning, scikit-learn, loss

How to compute binary log loss per sample of a scikit-learn ML model


I am attempting to apply binary log loss to a Naive Bayes ML model I created. I generated a categorical prediction dataset (yNew) and a probability dataset (probabilityYes), but I can't successfully run them through a log loss function.

The simple sklearn.metrics function gives a single log-loss result, and I'm not sure how to interpret it:

from sklearn.metrics import log_loss
ll = log_loss(yNew, probabilityYes, eps=1e-15)
print(ll)
0.0819...

A more complex function returns a value of 2.55 for each NO and 2.50 for each YES (90 columns in total); again, I have no idea how to interpret this:

import scipy as sp

def logloss(yNew, probabilityYes):
    # clip probabilities away from 0 and 1 so the logs stay finite
    epsilon = 1e-15
    probabilityYes = sp.maximum(epsilon, probabilityYes)
    probabilityYes = sp.minimum(1 - epsilon, probabilityYes)

    # compute logloss function (vectorised)
    ll = sum(yNew * sp.log(probabilityYes) +
             sp.subtract(1, yNew) * sp.log(sp.subtract(1, probabilityYes)))
    ll = ll * -1.0 / len(yNew)
    return ll

print(logloss(yNew,probabilityYes))
2.55352047 2.55352047 2.50358354 2.55352047 2.50358354 2.55352047 .....

Solution

  • Here is how you can compute the loss per sample:

    import numpy as np
    
    def logloss(true_label, predicted, eps=1e-15):
      # clip the prediction away from 0 and 1 so the log is always finite
      p = np.clip(predicted, eps, 1 - eps)
      if true_label == 1:
        return -np.log(p)
      else:
        return -np.log(1 - p)
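
    If you'd rather avoid a Python-level loop, a vectorized sketch of the same idea (using np.clip and np.where to pick the right term for each label) looks like this; either version can be checked with the dummy data below:

    def logloss_vec(true_labels, predicted, eps=1e-15):
      # clip, then select -log(p) or -log(1 - p) per sample in one step
      p = np.clip(predicted, eps, 1 - eps)
      return np.where(np.asarray(true_labels) == 1, -np.log(p), -np.log(1 - p))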
    

    Let's check it with some dummy data (we don't actually need a model for this):

    predictions = np.array([0.25,0.65,0.2,0.51,
                            0.01,0.1,0.34,0.97])
    targets = np.array([1,0,0,0,
                       0,0,0,1])
    
    ll = [logloss(x,y) for (x,y) in zip(targets, predictions)]
    ll
    # result:
    [1.3862943611198906,
     1.0498221244986778,
     0.2231435513142097,
     0.7133498878774648,
     0.01005033585350145,
     0.10536051565782628,
     0.41551544396166595,
     0.030459207484708574]
    

    From the array above, you should be able to convince yourself that the farther a prediction is from the corresponding true label, the greater the loss, as we would expect intuitively.
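
    For instance, sorting the per-sample losses together with their true labels and predictions (a quick sketch using the arrays already defined above) makes this explicit:

    # largest losses first: the worst predictions sit farthest from their labels
    for loss, t, p in sorted(zip(ll, targets, predictions), reverse=True):
      print(round(loss, 3), t, p)
    # 1.386 1 0.25
    # 1.05 0 0.65
    # 0.713 0 0.51
    # ...
    # 0.01 0 0.01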

    Let's just confirm that the computation above agrees with the total (average) loss as returned by scikit-learn:

    from sklearn.metrics import log_loss
    
    ll_sk = log_loss(targets, predictions)
    ll_sk
    # 0.4917494284709932
    
    np.mean(ll)
    # 0.4917494284709932
    
    np.mean(ll) == ll_sk
    # True
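
    As one more cross-check, log_loss also takes a normalize argument; with normalize=False it returns the sum of the per-sample losses instead of their mean:

    ll_sum = log_loss(targets, predictions, normalize=False)

    np.isclose(ll_sum, np.sum(ll))
    # True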
    

    Code adapted from here [link is now dead].
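
  • One more thing that may be worth checking in the original code (just a guess, since the shapes of yNew and probabilityYes aren't shown in the question): if probabilityYes is the full two-column output of predict_proba, the elementwise formula operates on both class columns at once, whereas per-sample log loss only needs the probability of the positive class (which column that is follows the model's classes_ attribute). A dummy illustration:

    import numpy as np

    # dummy stand-in for a predict_proba result: one row per sample,
    # columns assumed here to be [P(NO), P(YES)]
    proba = np.array([[0.9, 0.1],
                      [0.2, 0.8],
                      [0.7, 0.3]])

    p_yes = proba[:, 1]   # keep only the YES-class probabilities
    p_yes
    # array([0.1, 0.8, 0.3])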