In my research I wrote a neural network with two layers. The bottom (first) layer is an RNN that runs on the GPU; the top (second) layer runs on the CPU, because the nature of its algorithm is better suited to the CPU. I implemented the CPU layer as a custom Chainer Link.
But the CPU layer is slow, and my paper submission deadline is close, so I want to parallelize the computation in this layer.
What is the best practice, and the fastest way, to parallelize this link?
First, ChainerMN (not Chainer) does not support a direct way to accelerate computation within a single layer.
I would recommend considering the following options.
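One possible direction, sketched under assumptions: if the CPU layer processes each sample of the minibatch independently, the per-sample work can be spread over worker processes with Python's standard `multiprocessing` module. The names `cpu_step`, `make_pool` and `parallel_forward` are hypothetical, `cpu_step` is only a stand-in for the real per-sample computation, and the sketch covers the forward computation only.

```python
import multiprocessing as mp


def cpu_step(sample):
    """Hypothetical stand-in for the per-sample CPU computation of the layer."""
    return [x * x + 1.0 for x in sample]


def make_pool(n_workers):
    """Create a pool of worker processes.

    Assumption: the "fork" start method (available on Linux). On platforms
    without fork, use the default context and guard the entry point with
    `if __name__ == "__main__":`.
    """
    return mp.get_context("fork").Pool(processes=n_workers)


def parallel_forward(batch, pool):
    """Apply cpu_step to every sample in the batch using the worker pool."""
    # pool.map preserves input order, so outputs line up with the inputs.
    return pool.map(cpu_step, batch)


if __name__ == "__main__":
    batch = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    with make_pool(2) as pool:
        print(parallel_forward(batch, pool))
```

In a custom Link, a call like `parallel_forward` would sit inside the forward computation, and the backward computation would need the same treatment. Creating the pool once, before any GPU initialization, is the safer choice, since forking a process after CUDA has been initialized is generally unsafe. Whether this pays off depends on the per-sample cost being large compared with the cost of sending data to the workers.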
Thanks.