tensorflow-federated

Does TensorFlow Federated Support Reinforcement Learning?


I am trying to train a deep reinforcement learning model in a federated learning scenario. Does TensorFlow Federated (TFF) support reinforcement learning (RL) as an ML model? I understand that federated learning is mostly discussed for supervised learning, and I was curious whether reinforcement learning could be used in TFF as well.

If so, which library would you recommend for using RL with TFF?


Solution

  • The short answer is yes: TFF can support reinforcement learning at the level of the Federated Core API. Note that RL is not currently implemented in tff.learning (though we would welcome such a contribution). From a machine learning point of view, you can think of TFF as a communication layer on top of TF, so anything that TF supports, TFF can support.

    I will try to hit a few key points of the long answer:

    First, federated reinforcement learning is very much an open research question. Given the difficulty of training RL models in general, I think the FL community would be excited to see agents trained in the federated setting reproduce even the classical RL results, and we would be very excited to see such a thing implemented in TFF.

    Second, TFF in general supports any TensorFlow-based iterative learning process, in particular gradient-based learning. One could imagine many possible ways of modeling RL in the federated setting; TFF supports passing around any kind of update, so the sky is the limit in terms of what instantiation of federated RL TFF can support.

    Finally, I think the place to start in implementing RL in TFF is simply implementing RL in vanilla TensorFlow in a modular way. Any communication that has to happen in your chosen federated model of RL will have to be written in TFF, connecting the pieces of TensorFlow logic. If you implement e.g. your actor and your critic modularly with tf.function, it should be relatively simple to implement the communication you need inside a function decorated with @tff.federated_computation. For advice on mixing TF and TFF code, see this post by TFF’s lead author.
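
To make the split between local learning and communication concrete, here is a minimal sketch in plain Python, with no TensorFlow or TFF involved. It assumes one particular federated RL model chosen purely for illustration: every client runs REINFORCE on its own private two-armed bandit and sends back a parameter delta, and the server takes an unweighted mean of those deltas. All names here (client_update, server_aggregate, run_federated_rl, CLIENTS) and the reward probabilities are hypothetical. In a real implementation, client_update is the part you would write in TensorFlow, and server_aggregate stands in for the communication you would express in TFF, for example with an operator such as tff.federated_mean.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def client_update(global_prefs, reward_probs, rng, episodes=20, lr=0.1):
    """Local RL step: REINFORCE on this client's private bandit.

    Only the parameter delta leaves the client; raw rewards stay local.
    """
    prefs = list(global_prefs)
    for _ in range(episodes):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Gradient of log pi(action) w.r.t. the preferences is onehot - probs.
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * reward * (indicator - probs[a])
    return [p - g for p, g in zip(prefs, global_prefs)]


def server_aggregate(deltas):
    """Communication step: unweighted mean of the client deltas."""
    n = len(deltas)
    return [sum(column) / n for column in zip(*deltas)]


def run_federated_rl(client_reward_probs, rounds=50, seed=0):
    """Alternate local client updates with server-side averaging."""
    rng = random.Random(seed)
    global_prefs = [0.0] * len(client_reward_probs[0])
    for _ in range(rounds):
        deltas = [
            client_update(global_prefs, probs, rng)
            for probs in client_reward_probs
        ]
        mean_delta = server_aggregate(deltas)
        global_prefs = [g + d for g, d in zip(global_prefs, mean_delta)]
    return global_prefs


# Hypothetical clients: (P(reward | arm 0), P(reward | arm 1)) for each one.
CLIENTS = [(0.2, 0.8), (0.3, 0.7), (0.1, 0.9)]

final_prefs = run_federated_rl(CLIENTS)
policy = softmax(final_prefs)
print("global policy over arms:", policy)
```

The point of the structure is that client_update never needs to know how deltas are combined, and server_aggregate never needs to know how they were computed. Swapping the averaged delta for some other kind of update (gradients, critic parameters, trajectories) only changes what is passed between the two functions.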