tensorflow · machine-learning · distributed-computing · lstm · backpropagation

Distributed training with LSTM in tensorflow


Is LSTM an algorithm or a node? If I use it in a model, will backpropagation conflict with distributed training?


Solution

  • LSTM is neither. It's a type of recurrent neural network (see this post). In terms of tensorflow, this can be confusing, because there is a notion of a cell (e.g., BasicLSTMCell), which is essentially a factory for the recurrent units that form one or several layers. In the end, it all translates to nodes in the computational graph (a minimal sketch follows this answer). You can find a good usage example in this notebook. By the way, the training algorithm is still the same: backprop (through time).

    Now, concerning distributed training, there are two types of parallelism: data parallelism and model parallelism, and neither of them breaks backprop. The only possible exception is data parallelism with asynchronous updates, which indeed requires certain tricks to work, but there's no first-class support for it in tensorflow. You are probably better off with simpler ways to distribute your model, such as synchronous data parallelism (see this post and the second sketch below). So the answer is, most likely: no, backprop will work fine.
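
    To make the cell-vs-node distinction concrete, here is a minimal sketch in the TF 1.x-style API (BasicLSTMCell unrolled by dynamic_rnn); the shapes, layer size, and loss are illustrative, not taken from the question:

    ```python
    import tensorflow as tf

    # Batch of sequences: [batch, time_steps, features] (shapes are illustrative)
    inputs = tf.placeholder(tf.float32, [None, 20, 8])

    # BasicLSTMCell only describes the recurrent unit; dynamic_rnn unrolls it
    # into ordinary nodes of the computational graph.
    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=64)
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

    # Training is still plain backprop (through time): the optimizer simply
    # adds gradient and update nodes to the same graph.
    loss = tf.reduce_mean(tf.square(outputs))
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
    ```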
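
    And for synchronous data parallelism, where backprop is untouched because per-replica gradients are averaged before each update, here is a hedged sketch using the newer tf.distribute API with a Keras LSTM layer (model size and input shape are again illustrative):

    ```python
    import tensorflow as tf

    # Synchronous data parallelism: each replica processes a slice of the batch,
    # and the averaged gradients are applied once, so backprop itself is unchanged.
    strategy = tf.distribute.MirroredStrategy()

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.LSTM(64, input_shape=(20, 8)),
            tf.keras.layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")

    # model.fit(x, y, batch_size=128)  # the global batch is split across replicas
    ```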