Tags: python, tensorflow, keras

When using TensorFlow / Keras, what is the most efficient way to get some metadata inside a custom cost function?


In my dataset, I have a binary Target column, some Feature columns, and a Date column. I want to write a custom cost function that first computes a cost-by-date quantity and then adds all the costs up. But to do this, the cost function would need to know the date corresponding to each data point in y_pred and y_true.

What would be the best way to do this to maximize performance? I have a couple of ideas:

  • Make the target variable a tuple (target, date), add a custom first layer that extracts the target entry of the tuple, and have the cost function extract the date entry from y_true
  • Make the target variable an index, and have both the custom first layer and the custom cost function pull the relevant values from a global variable based on that index
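As a sketch of the second idea (the table contents and names here are made up for illustration), the loss could recover each sample's date with tf.gather from a global lookup table, using an index packed into y_true alongside the target:

```python
import tensorflow as tf

# Hypothetical global lookup table: entry i is the date (as an int) of sample i.
DATE_LOOKUP = tf.constant([20210101, 20210101, 20210102, 20210102])

def loss_with_index_lookup(y_true, y_pred):
    # y_true carries (target, sample_index) in two columns.
    target = y_true[:, 0:1]
    idx = tf.cast(y_true[:, 1], tf.int32)
    dates = tf.gather(DATE_LOOKUP, idx)  # dates aligned with this batch
    # ... `dates` could now drive a per-date cost; fall back to plain BCE here:
    return tf.keras.losses.binary_crossentropy(target, y_pred)
```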

What is the most efficient way to get this information inside the custom cost function?


Solution

  • I just found a way you could do that. I'm not quite sure how performant it would be, but one option is to use a custom loss of the following form:

    import tensorflow as tf

    def myLossWithDate(date_col):
        def customBinaryCrossEntropy(y_true, y_pred):
            # date_col is closed over, so the dates are visible inside the loss
            print(list(zip(date_col, y_true.numpy())))
            # compute a per-date cost here, or fall back to plain BCE:
            return tf.keras.losses.binary_crossentropy(y_true, y_pred)
        return customBinaryCrossEntropy

    You can then use this loss in your model like so:

    mod = tf.keras.models.Sequential([
        tf.keras.layers.Dense(1, activation="sigmoid")
    ])
    mod.compile(optimizer="sgd", loss=myLossWithDate(date_col=X[:,1]), run_eagerly=True)
    mod.fit(X, Y, epochs=1, verbose=False)
    

    The main thing here is to use

    run_eagerly=True
    

    Otherwise you would get symbolic iterator tensors instead of eager values (see https://www.tensorflow.org/guide/intro_to_graphs). Depending on the data, the output produced by the print(list(zip(...))) call looks like this:

    [(1, array([0])), (2, array([1])), (3, array([1]))]
    

    where I used

    Y = np.random.binomial(1, 0.5, 3).reshape(-1,1)
    X = np.column_stack((np.array([1,2,3]), np.array([1,2,3]))) # data, date as int
    

    as data.

    Obviously this is just a dummy example, but maybe it will help you.

    EDIT: Using minibatches

    The function changes as follows

    
    def myLossWithDate():
        def customBinaryCrossEntropy(y_true, y_pred):
            # y_true now carries two columns: the target and the date
            y_true_ = tf.reshape(y_true[:, 0], shape=(-1, 1))
            date_col = y_true[:, 1]  # the dates for this minibatch
            # compute a per-date cost here, or fall back to plain BCE:
            return tf.keras.losses.binary_crossentropy(y_true_, y_pred)
        return customBinaryCrossEntropy
    

    and pass

    Y = np.column_stack((Y, date_col))
    

    This is safe because Y is not used anywhere in backpropagation except to compute the loss, which you are now doing manually anyway.
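For the actual goal of the question (a cost per date, then summed), one way to group the batch by date inside the loss is tf.unique plus tf.math.unsorted_segment_mean. This sketch assumes the per-date cost is the mean binary cross-entropy within each date; swap in whatever per-date quantity you actually need:

```python
import tensorflow as tf

def date_grouped_bce():
    def loss(y_true, y_pred):
        # Column 0 is the target, column 1 the date (as packed into Y above).
        target = tf.reshape(y_true[:, 0], shape=(-1, 1))
        dates = y_true[:, 1]
        # Map each raw date value to a dense group id 0..n_dates-1.
        unique_dates, group_ids = tf.unique(dates)
        per_sample = tf.keras.losses.binary_crossentropy(target, y_pred)
        # Cost by date (mean BCE within each date), then summed over dates.
        per_date = tf.math.unsorted_segment_mean(
            per_sample, group_ids, num_segments=tf.size(unique_dates))
        return tf.reduce_sum(per_date)
    return loss
```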

    The model becomes

    batches = 2
    batch_size = int(X.shape[0] / batches)
    
    mod = tf.keras.models.Sequential([
        tf.keras.layers.Dense(1, activation="sigmoid")
    ])
    mod.compile(optimizer="sgd", loss=myLossWithDate(), run_eagerly=True)
    mod.fit(X, Y, epochs=1, verbose=False, batch_size=batch_size)
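    Putting the pieces together, a self-contained run of the minibatch variant (with made-up data) might look like this:

```python
import numpy as np
import tensorflow as tf

def myLossWithDate():
    def customBinaryCrossEntropy(y_true, y_pred):
        y_true_ = tf.reshape(y_true[:, 0], shape=(-1, 1))
        date_col = y_true[:, 1]  # dates for this minibatch, usable for per-date costs
        return tf.keras.losses.binary_crossentropy(y_true_, y_pred)
    return customBinaryCrossEntropy

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2)).astype("float32")
targets = rng.integers(0, 2, size=(6, 1)).astype("float32")
dates = np.array([[1], [1], [2], [2], [3], [3]], dtype="float32")
Y = np.column_stack((targets, dates))  # (target, date) packed into Y

mod = tf.keras.models.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
mod.compile(optimizer="sgd", loss=myLossWithDate(), run_eagerly=True)
history = mod.fit(X, Y, epochs=1, batch_size=3, verbose=0)
```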