Custom Loss Function in Keras with Sample Weights

I am new to TensorFlow and Keras. I would like to use sample weights in a custom loss function.

If I understand correctly, this post (Custom loss function with weights in Keras) suggests including the weights as an additional input to the network, as does this one: Custom weighted loss function in Keras for weighing each element.

I am wondering whether I am missing something (I would also like to avoid defining the weights as a global variable). I am also a bit surprised that there is no way to use them directly: the Loss class's __call__ method accepts sample_weight as an argument, but, if I understand correctly, the loss function itself must take only y_true and y_pred as arguments.

From the documentation, however:

Creating custom losses. Any callable with the signature loss_fn(y_true, y_pred) that returns an array of losses (one of sample in the input batch) can be passed to compile() as a loss. Note that sample weighting is automatically supported for any such loss.

It sounds like one should be able to use sample weighting by passing sample_weight=sample_weight to the fit method.

In this post (Should the custom loss function in Keras return a single loss value for the batch or an array of losses for every sample in the training batch?) there is a lengthy discussion about the size of the output of the loss function.

Lastly, it is also mentioned there that a custom loss function should return an array of losses (one per sample); their reduction is handled by the framework.
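As a rough illustration of that reduction, the NumPy sketch below mimics what Keras does by default, to my understanding, when per-sample losses and sample weights are combined: multiply elementwise, sum, and divide by the number of samples in the batch. The function name `weighted_batch_loss` and the numbers are made up for illustration; this is not Keras code.

```python
import numpy as np


def weighted_batch_loss(per_sample_losses, sample_weight=None):
    """Sketch of a 'sum over batch size' reduction with optional weights."""
    losses = np.asarray(per_sample_losses, dtype=float)
    if sample_weight is not None:
        # Each sample's loss is scaled by its own weight.
        losses = losses * np.asarray(sample_weight, dtype=float)
    # Divide by the number of samples, not by the sum of the weights.
    return losses.sum() / losses.size


per_sample = np.array([1.0, 0.0, 4.0, 9.0])  # shape (batch_size,)
print(weighted_batch_loss(per_sample))                        # plain mean
print(weighted_batch_loss(per_sample, [1.0, 1.0, 0.0, 0.0]))  # last two samples ignored
```

This only works if the loss function leaves the batch dimension intact; a loss that already returns a single scalar gives the framework nothing to weight per sample.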

It seems to me that if custom_loss(y_true, y_pred) returns a tensor of shape (batch_size,), then one ought to be able to use sample_weight in the fit method. What am I missing?

Thanks a lot for any help!

Code snippets:

import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras.losses import Loss

tfd = tfp.distributions
NUM_PARAMS_MG = 3  # assumed value (alpha, mu, sg); the original definition is not shown


class NegLogLikMixedGaussian(Loss):
    """
    Negative Log-Likelihood of Mixed Gaussian with:
        num_components: number of components
        mu: means of the Gaussian components
        sg: standard deviations of the Gaussian components
    """

    def __init__(self, num_params=NUM_PARAMS_MG,
                 num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_params = num_params
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for MDN the output of the networks are _parameters_ of the
        predicted distribution, _not_ point-estimates

        y_true: (batch_size, 1)
            Observed value of the random variable
        p_predict: (batch_size, num_components)
            Output parameters of the network given some input

        Negative log likelihood of the batch (batch_size, 1)
        """
        alpha, mu, sg = tf.split(p_predict,
                                 num_or_size_splits=self.num_params, axis=1)
        # MixtureSameFamily also needs the mixing distribution; alpha is
        # assumed here to hold the mixture probabilities.
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = tf.transpose(gm.log_prob(tf.transpose(y_true)))
        return -tf.reduce_mean(log_likelihood, axis=-1)

My hope was then to be able to use:

# "model" is an assumed name for the Keras model
model.compile(loss=NegLogLikMixedGaussian(
                      num_components=2, num_params=3))


# For testing purposes
sample_weight = np.ones(len(y_train)) / len(dh.y_train_scaled)  # this should give same results as un-weighted

# Some non-trivial weights
sample_weights = np.zeros(len(y_train))
sample_weights[:5] = 1
# This will give me same results as above
# ("model" and "x_train" are assumed names)
model.fit(x_train, y_train, sample_weight=sample_weight,
          batch_size=128, epochs=10)
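To see what such a weight vector means per batch, here is a small NumPy sketch. It assumes a hypothetical dataset of 1000 samples and no shuffling (Keras `fit` shuffles by default, so real batches would differ); the helper `iter_batches` is invented for illustration. The point is only that a weight vector of shape (num_samples,) is sliced alongside the data, so each batch sees weights of shape (batch_size,).

```python
import numpy as np


def iter_batches(num_samples, batch_size):
    """Yield index slices covering the dataset in order (no shuffling)."""
    for start in range(0, num_samples, batch_size):
        yield slice(start, min(start + batch_size, num_samples))


num_samples, batch_size = 1000, 128      # hypothetical sizes
sample_weights = np.zeros(num_samples)   # same pattern as in the question
sample_weights[:5] = 1

# One weight slice per batch, aligned with the samples in that batch.
batch_weights = [sample_weights[s] for s in iter_batches(num_samples, batch_size)]
batches_with_signal = sum(w.any() for w in batch_weights)
print(len(batch_weights), batches_with_signal)
```

With these weights and this ordering, only the batch containing the first five samples has a nonzero weighted loss; every other batch has a weighted loss of exactly zero.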


  • If I understood what you want to do, your code is correct apart from a few details. The sample weights should have shape (number of samples,), while the loss returned by call should have shape (batch_size,). The sample weights can then be passed to the fit method, and this seems to work. In your custom loss class, num_components and num_params are both initialized, but only num_params is used in the call method. I'm not sure I understood the dimensions of the tensor (alpha, mu, sg): is it of shape (batch_size, 3, num_components), and is it predicted by the model? Below is code adapted from yours, based on my understanding of your problem.

    import tensorflow as tf
    import numpy as np
    from tensorflow.keras.losses import Loss, BinaryCrossentropy
    from tensorflow.keras import Model, Input
    from tensorflow.keras.layers import Dense, Concatenate
    import tensorflow_probability as tfp
    tfd = tfp.distributions
    # parameters
    num_components = 2
    num_samples = 1001
    num_features = 10
    # synthetic data
    x_train = np.random.normal(size=(num_samples, num_features))
    y_train = np.random.normal(size=(num_samples, 1, num_components))
    class NegLogLikMixedGaussian(Loss):
        """
        Negative Log-Likelihood of Mixed Gaussian with:
            num_components: number of components
            mu: means of the Gaussian components
            sg: standard deviations of the Gaussian components
        """

        def __init__(self, num_components=2, name='neg_log_lik_mixed_gaussian'):
            super(NegLogLikMixedGaussian, self).__init__(name=name)
            self.num_components = num_components

        def call(self, y_true, p_predict):
            """
            Rem: for MDN the output of the networks are _parameters_ of the
            predicted distribution, _not_ point-estimates
            y_true: (batch_size, 1, num_components)
                Observed value of the random variable
            p_predict: (batch_size, 3, num_components)
                Output parameters of the network given some input
            Negative log likelihood of the batch (batch_size, 1)
            """
            alpha, mu, sg = tf.split(p_predict, num_or_size_splits=3, axis=1)
            # MixtureSameFamily also needs the mixing distribution; alpha is
            # assumed here to hold the mixture probabilities.
            gm = tfd.MixtureSameFamily(
                mixture_distribution=tfd.Categorical(probs=alpha),
                components_distribution=tfd.Normal(loc=mu, scale=sg))
            log_likelihood = gm.log_prob(y_true)
            return -tf.reduce_mean(log_likelihood, axis=[1, 2])
    # the model (simple predicting (alpha, mu, sigma))
    input = Input((num_features,))
    alpha = tf.expand_dims(Dense(num_components, 'relu')(input), axis=1)+0.0001
    # normalization
    alpha = alpha/tf.reduce_sum(alpha, axis=2, keepdims=True)
    mu = tf.expand_dims(Dense(num_components)(input), axis=1)
    # sg > 0
    sg = tf.expand_dims(Dense(num_components, 'relu')(input), axis=1)+ 0.0001
    outputs = Concatenate(axis=1)([alpha, mu, sg])
    model = Model(inputs=input, outputs=outputs, name='gmm_params')
    model.compile(optimizer='adam', loss=NegLogLikMixedGaussian(num_components=num_components), run_eagerly=False)
    # weight the first 500 samples with 1 and ignore the rest
    sample_weight = np.ones((num_samples, ))
    sample_weight[500:] = 0.
    model.fit(x_train, y_train, batch_size=16, epochs=20, sample_weight=sample_weight)
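A quick way to check that weights reach a custom loss, without training anything, is to call the loss object directly. The sketch below uses a deliberately simple stand-in (`PerSampleSquaredError`, a made-up name) instead of the mixture loss, and assumes TensorFlow 2 with the default reduction, which to my understanding sums the weighted per-sample losses and divides by the batch size.

```python
import tensorflow as tf


class PerSampleSquaredError(tf.keras.losses.Loss):
    """Hypothetical stand-in loss: returns one value per sample."""

    def call(self, y_true, y_pred):
        # Reduce only over the last axis so the batch axis survives.
        return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)


y_true = tf.constant([[0.0], [1.0], [2.0], [3.0]])
y_pred = tf.constant([[1.0], [1.0], [4.0], [0.0]])
weights = tf.constant([1.0, 1.0, 0.0, 0.0])  # ignore the last two samples

loss_fn = PerSampleSquaredError()
unweighted = float(loss_fn(y_true, y_pred))                       # mean of [1, 0, 4, 9]
weighted = float(loss_fn(y_true, y_pred, sample_weight=weights))  # only the first two samples count
print(unweighted, weighted)
```

If the two values are identical for non-trivial weights, the loss is most likely collapsing the batch dimension before the framework can apply the weights.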