
Implementing Binary Cross-Entropy loss gives a different answer than TensorFlow's


I am implementing the binary cross-entropy loss function in plain Python, but it gives me a very different answer from TensorFlow. This is the answer I got from TensorFlow:

import numpy as np
from tensorflow.keras.losses import BinaryCrossentropy

y_true = np.array([1., 1., 1.])
y_pred = np.array([1., 1., 0.])
bce = BinaryCrossentropy()
loss = bce(y_true, y_pred)
print(loss.numpy())

Output:

>>> 5.1416497230529785

As far as I know, the formula for binary cross-entropy over m samples is:

$$L = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

I implemented the same in plain Python with NumPy as follows:

def BinaryCrossEntropy(y_true, y_pred):
    m = y_true.shape[1]
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    # Calculating loss
    loss = -1/m * (np.dot(y_true.T, np.log(y_pred)) + np.dot((1 - y_true).T, np.log(1 - y_pred)))

    return loss

print(BinaryCrossEntropy(np.array([1, 1, 1]).reshape(-1, 1), np.array([1, 1, 0]).reshape(-1, 1)))

But this function gives me a loss value of:

>>> [[16.11809585]]

How can I get the right answer?


Solution

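  • Two things differ between your implementation and what Keras computes. First, m = y_true.shape[1] is 1 for a column vector of shape (3, 1), so the summed loss is never divided by the number of samples (that would be y_true.shape[0]). Second, the Keras result above is consistent with adding epsilon inside each log after clipping, so a prediction of 0 contributes log(2e-7) rather than log(1e-7). Here is a corrected version with NumPy.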

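    def BinaryCrossEntropy(y_true, y_pred):
        # Clip predictions away from 0 and 1 so the logs stay finite
        y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
        # Epsilon is added again inside each log to reproduce the Keras value
        term_0 = (1-y_true) * np.log(1-y_pred + 1e-7)
        term_1 = y_true * np.log(y_pred + 1e-7)
        # Average over the samples (axis 0)
        return -np.mean(term_0+term_1, axis=0)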
    
    print(BinaryCrossEntropy(np.array([1, 1, 1]).reshape(-1, 1), 
                             np.array([1, 1, 0]).reshape(-1, 1)))
    [5.14164949]
    

    Note that during tf.keras model training it is better to use the Keras backend functions. You can implement the loss in the same way using the Keras backend utilities (imported here as K).

    import numpy as np
    from tensorflow.keras import backend as K

    def BinaryCrossEntropy(y_true, y_pred):
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        term_0 = (1 - y_true) * K.log(1 - y_pred + K.epsilon())
        term_1 = y_true * K.log(y_pred + K.epsilon())
        return -K.mean(term_0 + term_1, axis=0)
    
    print(BinaryCrossEntropy(
        np.array([1., 1., 1.]).reshape(-1, 1), 
        np.array([1., 1., 0.]).reshape(-1, 1)
        ).numpy())
    [5.14164949]
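
To see where the two numbers in the question come from, here is a minimal NumPy-only sketch. It assumes epsilon is 1e-7 (the Keras default) and that the Keras version in use adds epsilon inside each log after clipping, as the code above does; other Keras releases may compute this differently. The names EPS, loss_unaveraged, loss_mean and loss_keras_style are only illustrative.

```python
import numpy as np

EPS = 1e-7  # assumption: same value as the default Keras epsilon

y_true = np.array([1.0, 1.0, 1.0]).reshape(-1, 1)  # shape (3, 1)
y_pred = np.array([1.0, 1.0, 0.0]).reshape(-1, 1)  # shape (3, 1)
clipped = np.clip(y_pred, EPS, 1 - EPS)

# Summed log-likelihood, as computed by the dot products in the question
total = (y_true.T @ np.log(clipped) + (1 - y_true).T @ np.log(1 - clipped)).item()

# 1) Question's version: shape[1] of a (3, 1) array is 1, so nothing is averaged
loss_unaveraged = -total / y_true.shape[1]

# 2) Dividing by the number of samples (shape[0]) gives a proper mean
loss_mean = -total / y_true.shape[0]

# 3) Mean with epsilon added inside each log after clipping
loss_keras_style = -np.mean(
    y_true * np.log(clipped + EPS) + (1 - y_true) * np.log(1 - clipped + EPS)
)

print(loss_unaveraged)   # approximately 16.118
print(loss_mean)         # approximately 5.373
print(loss_keras_style)  # approximately 5.142
```

The first value matches the 16.118 from the question and the last matches the TensorFlow output, so the gap comes from the missing average plus the extra epsilon inside the log. The middle value is what the textbook formula gives with clipping alone.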