Apologies if the terminology in the title is strange or incorrect; I am referring to the following scenario.
As a minimal example, I define a network as follows:
import torch.nn as nn

class Convolution_layers(nn.Module):
    # "in" is a reserved word in Python, so the arguments are named in_ch/out_ch
    def __init__(self, in_ch, out_ch, kernel):
        super(Convolution_layers, self).__init__()
        self.conv2d = nn.Conv2d(in_channels=in_ch, out_channels=out_ch, kernel_size=kernel)
        self.conv2d_layers = nn.Sequential(
            self.conv2d,
            nn.ReLU(),
        )

    def forward(self, x):
        return self.conv2d_layers(x)

class Network_Model(nn.Module):
    def __init__(self):
        super(Network_Model, self).__init__()
        self.basic_conv = Convolution_layers(1, 1, 3)
        self.subnetwk_1 = nn.ModuleList().append([self.basic_conv])
        self.subnetwk_2 = nn.ModuleList().append([self.basic_conv])

    def forward(self, x1, x2):
        out1, out2 = x1, x2
        for l in self.subnetwk_1:
            out1 = l(out1)
        for l in self.subnetwk_2:
            out2 = l(out2)
        return out1, out2
I would like to know if this would result in the weights in subnetwork 1 and 2 being shared, since they come from the same instance of Convolution layers.
Ideally I would like to have the weights be separate, but be able to create the basic convolution block only once, and then re-use it elsewhere. There may be a better way of accomplishing this.
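There is a mistake in how you build the module lists: nn.ModuleList().append(...) expects a single module, not a list, and the instance should not be called when it is added. Passing the list to the nn.ModuleList constructor avoids both problems. To answer your question: yes, both sub-networks will share the same weights, because the same instance is placed in both lists rather than two separate instances.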
shared_conv = Convolution_layers(1,1,3)
self.subnetwk_1 = nn.ModuleList([shared_conv])
self.subnetwk_2 = nn.ModuleList([shared_conv])
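A quick way to confirm the sharing is sketched below. It is only an illustration: a plain nn.Conv2d stands in for the Convolution_layers block so that the snippet runs on its own, and the variable names are made up for the example.

```python
import torch
import torch.nn as nn

# A plain Conv2d stands in for the Convolution_layers block (an assumption,
# made only to keep this sketch self-contained).
shared_conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

subnetwk_1 = nn.ModuleList([shared_conv])
subnetwk_2 = nn.ModuleList([shared_conv])

# Both lists hold the very same module object, hence the same weight tensor.
same_module = subnetwk_1[0] is subnetwk_2[0]
same_storage = subnetwk_1[0].weight.data_ptr() == subnetwk_2[0].weight.data_ptr()

# An in-place change made through one list is visible through the other.
with torch.no_grad():
    subnetwk_1[0].weight.fill_(0.5)
visible_in_other = bool((subnetwk_2[0].weight == 0.5).all())

print(same_module, same_storage, visible_in_other)
```

All three flags come out True, which is what "shared weights" means in practice: a gradient step on one sub-network also moves the other.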
What does "be able to create the basic convolution block only once" mean? If you are looking to have the two sub-networks share the same architecture but with separate weights, then you need to initialize two layers:
self.subnetwk_1 = nn.ModuleList([Convolution_layers(1,1,3)])
self.subnetwk_2 = nn.ModuleList([Convolution_layers(1,1,3)])
If you want separate sub-networks that are built with the same arguments, you can use keyword arguments and unpack one dictionary in each call to the constructor (the keys must match the parameter names of __init__; "in" is a reserved word in Python, so in_ch and out_ch are used here):
params = dict(in_ch=1, out_ch=1, kernel=3)
self.subnetwk_1 = nn.ModuleList([Convolution_layers(**params)])
self.subnetwk_2 = nn.ModuleList([Convolution_layers(**params)])
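As a self-contained sketch of the dictionary approach: ConvBlock below is a hypothetical stand-in for the convolution block in the question, and the parameter names in_ch, out_ch and kernel are assumptions chosen for the example. It shows that two constructor calls with the same arguments give the same architecture but independent parameters.

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Hypothetical stand-in for the convolution block in the question."""

    def __init__(self, in_ch, out_ch, kernel):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels=in_ch, out_channels=out_ch, kernel_size=kernel),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.layers(x)


params = dict(in_ch=1, out_ch=1, kernel=3)

# ** unpacks the dictionary into keyword arguments; each call builds a new
# module with its own parameters.
block_1 = ConvBlock(**params)
block_2 = ConvBlock(**params)

w1 = block_1.layers[0].weight
w2 = block_2.layers[0].weight

distinct_objects = block_1 is not block_2
distinct_storage = w1.data_ptr() != w2.data_ptr()
same_shape = w1.shape == w2.shape

print(distinct_objects, distinct_storage, same_shape)
```

Here the two blocks differ only in their randomly initialized values, so training one leaves the other untouched.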
Or, depending on the complexity of your initialization, make use of a helper function; maybe that is what you meant by "self.basic_conv()" in your code snippet:
class Network_Model(nn.Module):
    def __init__(self):
        super(Network_Model, self).__init__()
        self.subnetwk_1 = nn.ModuleList([self.basic_conv()])
        self.subnetwk_2 = nn.ModuleList([self.basic_conv()])

    def basic_conv(self):
        # Every call returns a fresh block with its own weights
        return Convolution_layers(1, 1, 3)
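If the goal is literally to build the block once and then reuse it with separate weights, one further option is to copy a template module with copy.deepcopy. This is an alternative offered as a sketch, not something stated in the answer above; the template here is a plain nn.Sequential chosen for illustration. Note that the copies start from identical initial values, unlike freshly constructed blocks, but they are independent afterwards.

```python
import copy

import torch
import torch.nn as nn

# Template block, built once (illustrative stand-in for the block in the question).
template = nn.Sequential(nn.Conv2d(1, 1, kernel_size=3), nn.ReLU())

# Each deepcopy is a separate module with its own parameter tensors.
subnetwk_1 = nn.ModuleList([copy.deepcopy(template)])
subnetwk_2 = nn.ModuleList([copy.deepcopy(template)])

w1 = subnetwk_1[0][0].weight
w2 = subnetwk_2[0][0].weight

# The copies begin with the same values as the template...
starts_equal = torch.equal(w1, w2)

# ...but changing one does not affect the other.
with torch.no_grad():
    w1.add_(1.0)
independent = not torch.equal(w1, w2)

print(starts_equal, independent)
```

Whether identical starting values are acceptable depends on the use case; if not, constructing the blocks separately, as in the helper above, gives independent random initializations.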