I have an Nx1xHxW feature map. I need to add a second head that generates an Nx3xHxW output representing pixel-wise regression, with a triplet for each pixel.
The question is: how would you go from Nx1xHxW to Nx3xHxW? A fully connected layer would be too expensive in terms of introduced parameters.
That's why I am trying a 1x1x3 convolutional filter with stride 1, defined in PyTorch as nn.Conv2d(1, 3, (1, 1), stride=1, bias=True), but the results do not seem encouraging. Any suggestions would be welcome.
Best
You can expand the dimension of the data at any point in the forward function with non-parametric operations to force the output into this shape. For instance:
    def forward(self, input):
        # Tile the single channel three times: Nx1xHxW -> Nx3xHxW
        input = input.repeat(1, 3, 1, 1)
        output = self.layers(input)
        return output
or:
    def forward(self, input):
        intermediate = self.layers(input)
        # repeat() returns a new tensor, so the result must be assigned
        intermediate = intermediate.repeat(1, 3, 1, 1)
        output = self.more_layers(intermediate)
        return output
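To make the first variant concrete, here is a minimal runnable sketch of such a head. The class name RegressionHead, the hidden width, and the choice of layers are my own assumptions for illustration, not something taken from your model:

```python
import torch
import torch.nn as nn


class RegressionHead(nn.Module):
    """Hypothetical second head mapping Nx1xHxW to Nx3xHxW."""

    def __init__(self, hidden=16):  # hidden width is an arbitrary assumption
        super().__init__()
        # Layers operate on the already-expanded 3-channel input.
        self.layers = nn.Sequential(
            nn.Conv2d(3, hidden, kernel_size=3, padding=1),  # padding keeps HxW
            nn.ReLU(),
            nn.Conv2d(hidden, 3, kernel_size=1),  # one triplet per pixel
        )

    def forward(self, x):
        # Non-parametric expansion: Nx1xHxW -> Nx3xHxW
        x = x.repeat(1, 3, 1, 1)
        return self.layers(x)


features = torch.randn(2, 1, 8, 8)  # dummy Nx1xHxW feature map
head = RegressionHead()
out = head(features)
print(out.shape)
```

The 3x3 convolution with padding=1 and the 1x1 convolution both preserve the spatial size, so only the channel count changes.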
Theoretically, there is some nonlinear function that produces the 3-dimensional pixel-wise output given a 1-dimensional input. You can try to learn this function with a series of NN layers, but, as you indicated above, this may not give great results and may be difficult to learn well. Instead, you can simply expand the input at some point, so that the NN layers learn a 3-dimensional to 3-dimensional pixel-wise nonlinear function. Tensor.repeat and similar operations are differentiable, so they shouldn't cause an issue with learning.
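If you want to check the differentiability claim yourself, a small sketch like the one below should do. The tensor sizes are arbitrary assumptions. It shows that gradients flow back through repeat, with the contributions of the three copies summed into the original channel:

```python
import torch

# A tiny 1x1x2x2 input that tracks gradients.
x = torch.ones(1, 1, 2, 2, requires_grad=True)

# Expand to three channels, then reduce to a scalar and backpropagate.
y = x.repeat(1, 3, 1, 1)
y.sum().backward()

# Each input pixel was copied three times, so its gradient is 3.
print(x.grad)
```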