Tags: python-3.x, tensorflow, gradient-descent

TensorFlow: Where is the actual implementation of RMSprop?


In "rmsprop.py" (in TensorFlow) there is a call to the method apply_rms_prop. This method is defined in "gen_training_ops.py". In the definition of this method there is a comment describing what it is supposed to do:

ms <- rho * ms_{t-1} + (1-rho) * grad * grad
mom <- momentum * mom_{t-1} + lr * grad / sqrt(ms + epsilon)
var <- var - mom

But I can't seem to find the actual Python implementation of the pseudocode above. My guess is that it is implemented in CPython, since I was able to find the file "__pycache__/rmsprop.cpython-36.pyc". But again, where is the CPython implementation that performs the pseudocode above?

My goal is to implement my own gradient update methods, so I need to see some concrete implementation examples (e.g. RMSprop, Adam). Any help would be much appreciated!


Solution

  • You can implement your own optimizer by subclassing the Optimizer class. You have to implement at least one of the methods _apply_dense or _apply_sparse.

    A complete AdaMax optimizer can be implemented using only already available TensorFlow ops. The skeleton looks like this (a fuller sketch follows after it):

    class AdamaxOptimizer(optimizer.Optimizer):
        ...
        def _create_slots(self, var_list):
            # Create slot variables (per-variable accumulators) here.
            ...

        def _apply_dense(self, grad, var):
            # Implement your logic for the gradient update here.
            ...
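    A minimal end-to-end sketch of such a subclass, using the TF 1.x Optimizer API (the bias-correction term of the published AdaMax update is left out for brevity, and the hyperparameter defaults are only illustrative):

    from tensorflow.python.ops import control_flow_ops, math_ops, state_ops
    from tensorflow.python.training import optimizer

    class AdamaxOptimizer(optimizer.Optimizer):
        """AdaMax: like Adam, but with an infinity-norm accumulator."""

        def __init__(self, learning_rate=0.002, beta1=0.9, beta2=0.999,
                     epsilon=1e-8, use_locking=False, name="Adamax"):
            super(AdamaxOptimizer, self).__init__(use_locking, name)
            self._lr = learning_rate
            self._beta1 = beta1
            self._beta2 = beta2
            self._epsilon = epsilon

        def _create_slots(self, var_list):
            # One first-moment slot "m" and one infinity-norm slot "u" per variable.
            for v in var_list:
                self._zeros_slot(v, "m", self._name)
                self._zeros_slot(v, "u", self._name)

        def _apply_dense(self, grad, var):
            lr = math_ops.cast(self._lr, var.dtype.base_dtype)
            beta1 = math_ops.cast(self._beta1, var.dtype.base_dtype)
            beta2 = math_ops.cast(self._beta2, var.dtype.base_dtype)
            eps = math_ops.cast(self._epsilon, var.dtype.base_dtype)

            m = self.get_slot(var, "m")
            u = self.get_slot(var, "u")

            # m <- beta1 * m + (1 - beta1) * grad
            m_t = state_ops.assign(m, beta1 * m + (1.0 - beta1) * grad,
                                   use_locking=self._use_locking)
            # u <- max(beta2 * u, |grad|)
            u_t = state_ops.assign(u, math_ops.maximum(beta2 * u, math_ops.abs(grad)),
                                   use_locking=self._use_locking)
            # var <- var - lr * m_t / (u_t + eps)
            var_update = state_ops.assign_sub(var, lr * m_t / (u_t + eps),
                                              use_locking=self._use_locking)
            return control_flow_ops.group(var_update, m_t, u_t)

        def _apply_sparse(self, grad, var):
            raise NotImplementedError("Sparse gradients are not handled in this sketch.")

    It is then used like any built-in optimizer: train_op = AdamaxOptimizer(0.002).minimize(loss).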