Tags: python, optimization, tensorflow, gradient-descent

Finding the implementation of methods in TensorFlow


I want to make small changes to minimization optimizers such as TensorFlow's AdadeltaOptimizer. The license allows this, but the library I have contains no implementation code, only the reference, so how can I find the actual implementation? Here is the Adadelta API as an example:

    @tf_export("train.AdadeltaOptimizer")
    class AdadeltaOptimizer(optimizer.Optimizer):
      """Optimizer that implements the Adadelta algorithm.

      See [M. D. Zeiler](http://arxiv.org/abs/1212.5701)
      ([pdf](http://arxiv.org/pdf/1212.5701v1.pdf))
      """

Solution

  • The first entry point is python/training/adadelta.py in the main TensorFlow repo. You may notice it is only a Python wrapper: the ops themselves are implemented in native C++ and loaded into Python (this is the usual practice in TensorFlow; see for instance this question: Where is the code for gradient descent?). A rough sketch of how that wrapper dispatches to the native op is given at the end of this answer.

    For example, in core/kernels/training_ops.cc you can find the CPU implementation of the ApplyAdadelta op. The GPU implementation of the same op is in core/kernels/training_ops_gpu.cu.cc:

    template <typename T>
    struct ApplyAdadelta<GPUDevice, T> {
      void operator()(const GPUDevice& d, typename TTypes<T>::Flat var,
                      typename TTypes<T>::Flat accum,
                      typename TTypes<T>::Flat accum_update,
                      typename TTypes<T>::ConstScalar lr,
                      typename TTypes<T>::ConstScalar rho,
                      typename TTypes<T>::ConstScalar epsilon,
                      typename TTypes<T>::ConstFlat grad) {
        Eigen::array<typename TTypes<T>::Tensor::Index, 1> bcast;
        bcast[0] = grad.dimension(0);
        Eigen::Sizes<1> single;
    
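        // accum = rho * accum + (1 - rho) * grad^2: running average of squared gradients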
        accum.device(d) = accum * rho.reshape(single).broadcast(bcast) +
                          grad.square() * (grad.constant(T(1)) -
                                           rho.reshape(single).broadcast(bcast));
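        // update = sqrt(accum_update + eps) / sqrt(accum + eps) * grad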
        const auto update =
            (accum_update + epsilon.reshape(single).broadcast(bcast)).sqrt() *
            (accum + epsilon.reshape(single).broadcast(bcast)).rsqrt() * grad;
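        // var -= lr * update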
        var.device(d) -= update * lr.reshape(single).broadcast(bcast);
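        // accum_update = rho * accum_update + (1 - rho) * update^2: running average of squared updates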
        accum_update.device(d) =
            accum_update * rho.reshape(single).broadcast(bcast) +
            update.square() *
                (grad.constant(T(1)) - rho.reshape(single).broadcast(bcast));
      }
    };
    

    If you'd like to patch the C++ code, you'll have to rebuild the .so library. To be able to run your modified optimizer on both CPU and GPU, you'll have to change and rebuild both kernels.
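    For reference, the dispatch in python/training/adadelta.py looks roughly like the sketch below (based on the 1.x layout; the exact names and signatures depend on the revision you check out, so treat this as an approximation rather than a verbatim copy):

    from tensorflow.python.ops import math_ops
    from tensorflow.python.training import optimizer
    from tensorflow.python.training import training_ops

    class AdadeltaOptimizer(optimizer.Optimizer):
        # ... slot creation and hyperparameter handling omitted ...

        def _apply_dense(self, grad, var):
            # The accumulators live in "slots" attached to each variable.
            accum = self.get_slot(var, "accum")
            accum_update = self.get_slot(var, "accum_update")
            # This call dispatches to the registered ApplyAdadelta kernel
            # (the C++ code shown above) for the device that holds `var`.
            return training_ops.apply_adadelta(
                var, accum, accum_update,
                math_ops.cast(self._lr_t, var.dtype.base_dtype),
                math_ops.cast(self._rho_t, var.dtype.base_dtype),
                math_ops.cast(self._epsilon_t, var.dtype.base_dtype),
                grad, use_locking=self._use_locking)

    So if your change only rearranges existing ops (a different decay schedule, say), you can often stay in the Python wrapper; if you change the update rule itself, you need to touch the kernel.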
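    If you do patch the kernel, it also helps to keep a plain NumPy reference of the same update rule to compare against. A minimal sketch mirroring the ApplyAdadelta code above (the function name and default hyperparameters here are mine, not TensorFlow's):

    import numpy as np

    def adadelta_step(var, accum, accum_update, grad, lr=0.001, rho=0.95, eps=1e-8):
        """One Adadelta step, following the kernel shown above."""
        accum = rho * accum + (1.0 - rho) * grad ** 2                    # running average of grad^2
        update = np.sqrt(accum_update + eps) / np.sqrt(accum + eps) * grad
        var = var - lr * update                                          # apply scaled update
        accum_update = rho * accum_update + (1.0 - rho) * update ** 2    # running average of update^2
        return var, accum, accum_update

    Running a few steps of this against the output of your rebuilt op on the same inputs is a quick way to confirm the patched kernel does what you intended.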