The documentation of tf.layers.dense states that the output tensor has the same shape as the inputs, except that the last dimension is of size units. I was trying to get similar behavior in Chainer, but without success.
In TensorFlow, one can feed a (32, 28, 28, 512) tensor to a dense (linear) layer and get a (32, 28, 28, 256) tensor back. From what I have read about tf.layers.dense, when the input has more than 2 dimensions it applies the same weights along the last axis at every position and does not flatten the input first.
chainer.links.Linear, on the other hand, does flatten the input, and as a result the layer does not fit in memory. Is it possible to get the same functionality as tf.layers.dense in Chainer?
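For a rough sense of the sizes involved, the arithmetic below compares the weight count of a layer that shares one matrix over the last axis with a layer that flattens everything after the batch axis. It only uses the shapes from the question; the float32 (4 bytes per weight) size is an assumption for illustration.

```python
# Shapes taken from the question.
batch, height, width, channels = 32, 28, 28, 512
units = 256

# Weights shared over all leading axes: a single (units, channels) matrix.
shared_weights = units * channels

# Input flattened to (batch, height * width * channels): one much larger matrix.
flattened_weights = units * (height * width * channels)

print(shared_weights)                  # 131072
print(flattened_weights)               # 102760448
print(flattened_weights * 4 / 2**20)   # 392.0 (MiB, assuming float32 weights)
```

The flattened layer needs 784 times as many weights (28 * 28), before counting gradients and optimizer state.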
How about reshaping the input before and after applying L.Linear?
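import chainer.functions as F
import chainer.links as L

l = L.Linear(512, 256)
# x is (32, 28, 28, 512)
s0, s1, s2, s3 = x.shape
# Merge the leading axes so Linear sees a 2-D (32*28*28, 512) batch
h = F.reshape(x, (s0 * s1 * s2, s3))
h = l(h)
# Restore the leading axes on the Linear output
h = F.reshape(h, (s0, s1, s2, 256))
# Now h should be (32, 28, 28, 256)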
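To check that reshaping before and after the layer really is the same as applying one shared weight matrix along the last axis, here is a small NumPy sketch. It does not use Chainer; the helper name linear_last_axis and the tiny shapes are made up for illustration, and W is laid out as (out, in) to mirror how Chainer's Linear stores its weight.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 3, 8)).astype(np.float32)  # small stand-in for (32, 28, 28, 512)
W = rng.standard_normal((4, 8)).astype(np.float32)        # (out, in)
b = rng.standard_normal(4).astype(np.float32)

def linear_last_axis(x, W, b):
    """Hypothetical helper: reshape to 2-D, apply the linear map, reshape back."""
    lead = x.shape[:-1]
    h = x.reshape(-1, x.shape[-1])   # (2*3*3, 8)
    h = h @ W.T + b                  # (2*3*3, 4)
    return h.reshape(lead + (W.shape[0],))

y = linear_last_axis(x, W, b)

# Reference: contract the last axis of x with the input axis of W directly.
expected = np.tensordot(x, W, axes=([3], [1])) + b

print(y.shape)
print(np.allclose(y, expected, atol=1e-4))
```

The two reshapes only change how the array is indexed, so the memory cost stays that of a single (256, 512) weight matrix.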