Tags: python, tensorflow, keras, gradient-descent

How to scale the gradient during a batch update in Keras?


I am using a standard Keras model and training on batches with the train_on_batch function. I want to take the gradient of each sample in the batch, scale it by a sample-specific value that I already have, and then sum the scaled gradients and use the result to update the existing weights. Is there any way to do this with Keras functions? And if not, can I do it through TensorFlow (given that the model and the rest of the code are written in Keras)?

The operation looks like this (the loop is only to illustrate that it happens for every sample in the batch):

grad = 0
# w: array of per-sample scaling factors, of length batch_size
for i in range(batch_size):
    grad += w[i] * grad_i  # grad_i: the gradient computed for sample i

Solution

    • Use the sample_weight argument in the fit method of a model.
    • Or, if using a generator, make the generator return not only X_train, y_train, but X_train, y_train, sample_weights.

    In both cases, the sample weights should be a 1D array with one weight per sample in the data.
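
For concreteness, here is a minimal sketch of both options, assuming a TF 2.x tf.keras setup. The model architecture, the dummy data, and the batch_generator helper below are invented for illustration; only the sample_weight argument and the three-element generator tuple are the actual mechanism.

import numpy as np
from tensorflow import keras

# A small example model (the architecture is arbitrary).
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

# Dummy batch standing in for the real data.
x_batch = np.random.rand(32, 10)
y_batch = np.random.rand(32, 1)

# One scaling factor per sample (the w_i values from the question).
w = np.random.rand(32)

# train_on_batch, as used in the question, accepts sample_weight directly...
model.train_on_batch(x_batch, y_batch, sample_weight=w)

# ...and so does fit.
model.fit(x_batch, y_batch, sample_weight=w, epochs=1, verbose=0)

# For the generator option, yield three-element tuples instead.
def batch_generator():
    while True:
        yield x_batch, y_batch, w

model.fit(batch_generator(), steps_per_epoch=1, epochs=1, verbose=0)

Because sample_weight multiplies each sample's loss term, the gradient that reaches the optimizer is (up to the usual mean reduction over the batch) the weighted sum of the per-sample gradients, which is exactly the loop in the question.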