I am reproducing a paper, https://arxiv.org/abs/1711.11575, which contains one formula:
I searched Chainer, but it only has F.softmax, which does not let me put a weight on each term. How can I reimplement that formula?
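If you want to add the term w_G^{mn}, I guess that adding log(w_G^{mn}) to each score before applying the usual softmax should have the same effect, because exp(x + log(w)) = w * exp(x), so the weight appears in both the numerator and the normalizing sum.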
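To check the idea numerically, here is a minimal sketch in plain NumPy rather than Chainer. It assumes the formula in question has the form w_G^{mn} * exp(w_A^{mn}) / sum_k w_G^{kn} * exp(w_A^{kn}), that the weights w_G are non-negative, and that the normalization runs over the last axis. The function names and the `eps` clamp are hypothetical, introduced only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def weighted_softmax(w_a, w_g, axis=-1, eps=1e-6):
    # Log trick: softmax(w_a + log(w_g)) == w_g * exp(w_a) / sum(w_g * exp(w_a)).
    # eps (an assumption) keeps log() finite when a weight is exactly zero.
    return softmax(w_a + np.log(np.maximum(w_g, eps)), axis=axis)


def weighted_softmax_direct(w_a, w_g, axis=-1):
    # Direct evaluation of the assumed formula, used only as a reference.
    num = w_g * np.exp(w_a)
    return num / num.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
w_a = rng.normal(size=(3, 5))             # stand-in for the scores w_A
w_g = rng.uniform(0.1, 1.0, size=(3, 5))  # stand-in for the weights w_G
out = weighted_softmax(w_a, w_g)
print(np.allclose(out, weighted_softmax_direct(w_a, w_g)))
```

In Chainer the same thing should be expressible as F.softmax(w_a + F.log(w_g), axis=...), with the axis chosen to match the index that the sum runs over. If w_G can be exactly zero, clamp it to a small positive value first so the log stays finite.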