Assuming the log loss equation to be:
$$\text{logLoss} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log(p_i) + (1-y_i)\log(1-p_i)\right]$$
where $N$ is the number of samples, $y_1,\dots,y_N$ are the actual values of the dependent variable, and $p_1,\dots,p_N$ are the predicted probabilities from the logistic regression.
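To make the formula concrete, here is a minimal NumPy sketch of it (the function name, the example arrays, and the clipping constant `eps` are my own choices, not from any particular library):

```python
import numpy as np

def log_loss(y, p, eps=1e-15):
    """Mean negative log-likelihood for binary labels y and predicted probabilities p."""
    y = np.asarray(y, dtype=float)
    # Clip probabilities away from exactly 0 and 1 so the logs stay finite.
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Two confident correct predictions and one confident wrong one:
print(log_loss([1, 0, 1], [0.9, 0.1, 0.2]))  # ~0.61, dominated by the mismatched third sample
```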
How I am looking at it: if $y_i = 0$, then the first part $y_i\log(p_i) = 0$; alternatively, if $y_i = 1$, then the second part $(1-y_i)\log(1-p_i) = 0$. So, depending on the value of $y_i$, one part of the equation is excluded. Am I understanding this correctly?
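For example, with made-up numbers: if $y_i = 1$ and $p_i = 0.8$, the per-sample contribution reduces to
$$-\left[1\cdot\log(0.8) + 0\cdot\log(0.2)\right] = -\log(0.8) \approx 0.223,$$
while if $y_i = 0$ with the same $p_i = 0.8$, it reduces to $-\log(1-0.8) = -\log(0.2) \approx 1.609$.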
My ultimate goal is to understand how to interpret the results of log loss.
Yes, you are on the right track. Keeping in mind that $p_i = P(y_i = 1)$, the basic idea is that the loss function needs to be defined in such a way that it penalizes the tuples for which the prediction does not match the actual label (e.g., when $y_i = 1$ but $p_i$ is low, taken care of by the $y_i\log(p_i)$ part, or when $y_i = 0$ but $p_i$ is high, taken care of by the $(1-y_i)\log(1-p_i)$ part), and at the same time does not penalize much the tuples for which the prediction matches the actual label (e.g., when $y_i = 1$ and $p_i$ is high, or when $y_i = 0$ and $p_i$ is low).
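Here is a quick numeric illustration of that property (the probabilities are arbitrary, and `per_sample_loss` is just a helper name I picked):

```python
import numpy as np

def per_sample_loss(y, p):
    """Cross-entropy contribution of a single (label, probability) pair."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Prediction matches the label: small penalty.
print(per_sample_loss(1, 0.95))  # ~0.05
print(per_sample_loss(0, 0.05))  # ~0.05

# Prediction contradicts the label: large penalty.
print(per_sample_loss(1, 0.05))  # ~3.00
print(per_sample_loss(0, 0.95))  # ~3.00
```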
The loss function for logistic regression (cross entropy) has exactly this desired property, as can be seen from the following figure.
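Here is a small matplotlib sketch reproducing the figure (my own reconstruction, assuming the figure plots the per-sample loss against the predicted probability for each of the two labels):

```python
import numpy as np
import matplotlib.pyplot as plt

p = np.linspace(0.001, 0.999, 500)

# Per-sample cross entropy as a function of the predicted probability p_i.
plt.plot(p, -np.log(p), label=r"$y_i = 1$: $-\log(p_i)$")
plt.plot(p, -np.log(1 - p), label=r"$y_i = 0$: $-\log(1 - p_i)$")
plt.xlabel(r"predicted probability $p_i$")
plt.ylabel("per-sample loss")
plt.legend()
plt.show()
```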