Tags: keras, imbalanced-data

Change learning rate within minibatch - keras


I have a problem with imbalanced labels: for example, 90% of the data have the label 0 and the remaining 10% have the label 1.

I want to train the network with minibatches, so I want the optimizer to give the examples labeled 1 a learning rate (or to somehow scale their gradients to be) 9 times greater than those labeled 0.

Is there any way of doing that?

The problem is that the whole training process is done in this line:

history = model.fit(trainX, trainY, epochs=1, batch_size=minibatch_size, validation_data=(valX, valY), verbose=0)

Is there a way to modify the fit method at a lower level?


Solution

  • You can try using the class_weight parameter of Keras's fit method.

    From the Keras docs:

    class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only).

    Example of using it on imbalanced data: https://www.tensorflow.org/tutorials/structured_data/imbalanced_data#class_weights

    class_weights = {0: 1, 1: 10}  # keys must be class indices (integers), per the docs
    history = model.fit(trainX, trainY, epochs=1, batch_size=minibatch_size, validation_data=(valX, valY), verbose=0, class_weight=class_weights)
    

    Full example:

    import numpy as np

    # Examine the class label imbalance.
    # You can use your_df['label_class_column'] or just the trainY values.
    neg, pos = np.bincount(your_df['label_class_column'])
    total = neg + pos
    print('Examples:\n    Total: {}\n    Positive: {} ({:.2f}% of total)\n'.format(
        total, pos, 100 * pos / total))

    # Scaling by total/2 helps keep the loss to a similar magnitude.
    # The sum of the weights of all examples stays the same.
    weight_for_0 = (1 / neg) * (total / 2.0)
    weight_for_1 = (1 / pos) * (total / 2.0)

    class_weight = {0: weight_for_0, 1: weight_for_1}
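
    Then pass the computed dictionary to fit, just as in the earlier snippet:

    history = model.fit(trainX, trainY, epochs=1, batch_size=minibatch_size,
                        validation_data=(valX, valY), verbose=0,
                        class_weight=class_weight)

  • If you want per-example control rather than per-class weighting, fit also accepts a sample_weight array with one weight per training example. A minimal sketch, assuming trainY is a flat array of 0/1 integer labels and using the 9x factor from your question:

    import numpy as np

    # Weight each positive example 9 times more than each negative one.
    sample_weights = np.where(trainY == 1, 9.0, 1.0)

    history = model.fit(trainX, trainY, epochs=1, batch_size=minibatch_size,
                        validation_data=(valX, valY), verbose=0,
                        sample_weight=sample_weights)

    This scales each example's contribution to the loss (and therefore its gradients), which has the same effect as the per-example learning rate you describe.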