Tags: python, pytorch, resnet

Forwarding a residual block layer by layer gives a wrong result


I wrote PyTorch code with residual connections as follows:

import torch
import torch.nn as nn

n_hidden_channels = 64  # example width; any channel count works

all_module = []
for i in range(3):
  layer = nn.Sequential(
      # kernel_size/padding keep the output length equal to the input length,
      # which the residual addition below requires
      nn.Conv1d(n_hidden_channels, n_hidden_channels, kernel_size=3, padding=1),
      nn.LeakyReLU(),
      nn.Conv1d(n_hidden_channels, n_hidden_channels, kernel_size=3, padding=1),
      nn.LeakyReLU()
      )
  all_module.append(layer)
module_list = nn.ModuleList(all_module)


x = torch.randn(1, n_hidden_channels, 16)  # example input

# method 1
for layer in module_list:
  x = x + layer(x)
print(x)

# method 2
for layer in module_list:
  y = torch.clone(x)
  for m in layer:
    y = m(y)
  x = x + y
print(x)

Why are the outputs of method 1 and method 2 different?

I have no idea why this happens.


Solution

  • Both methods do the same thing. The difference comes from running them one after the other: method 1 rebinds x to its output, so method 2 starts from the already-updated tensor rather than the original input. If you save a copy of x before running method 1 and restore it before method 2, both methods produce the same result.
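
A minimal sketch that verifies this, assuming hypothetical sizes (n_hidden_channels = 64, batch 1, sequence length 16) and the same block construction as in the question. The key change is keeping an untouched copy x0 of the input and starting each method from a fresh clone of it:

import torch
import torch.nn as nn

# Hypothetical sizes for demonstration; kernel_size/padding preserve the
# sequence length so the residual addition is shape-compatible.
n_hidden_channels = 64
module_list = nn.ModuleList([
    nn.Sequential(
        nn.Conv1d(n_hidden_channels, n_hidden_channels, kernel_size=3, padding=1),
        nn.LeakyReLU(),
        nn.Conv1d(n_hidden_channels, n_hidden_channels, kernel_size=3, padding=1),
        nn.LeakyReLU(),
    )
    for _ in range(3)
])

x0 = torch.randn(1, n_hidden_channels, 16)  # keep the original input untouched

# Method 1, starting from a fresh copy of the input
x = x0.clone()
for layer in module_list:
    x = x + layer(x)
out1 = x

# Method 2, starting again from the same original input
x = x0.clone()
for layer in module_list:
    y = torch.clone(x)
    for m in layer:
        y = m(y)
    x = x + y
out2 = x

print(torch.allclose(out1, out2))  # True: the two loops are equivalent

With the shared starting point restored before each loop, torch.allclose reports True, confirming that iterating over the Sequential's submodules by hand computes exactly the same thing as calling the Sequential directly.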