
Why is the PyTorch Dropout layer affecting all values, not only the ones set to zero?


PyTorch's dropout layer changes the values that are not set to zero. Using the example from PyTorch's documentation (source):

import torch
import torch.nn as nn

m = nn.Dropout(p=0.5)
input = torch.ones(5, 5)
print(input)
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])

Then I pass it through a dropout layer:

output = m(input)
print(output)
tensor([[0., 0., 2., 2., 0.],
        [2., 0., 2., 0., 0.],
        [0., 0., 0., 0., 2.],
        [2., 2., 2., 2., 2.],
        [2., 0., 0., 0., 2.]])

The values that aren't set to zero are now 2. Why?


Solution

  • This is how dropout regularization works. After dropout, the values that are not zeroed are divided by the keep probability (in this case 0.5).

    Since PyTorch's Dropout function takes the probability of zeroing a neuron as its argument, nn.Dropout(p=0.2) means each neuron has a 0.8 chance of being kept, so the surviving values in the output are scaled by 1/(1 - 0.2) = 1.25.

    This is called the "inverted dropout" technique, and it is done to ensure that the expected value of the activations stays the same during training; see the quick check sketched below.
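
As a quick check (a minimal sketch, not part of the original answer), you can verify the 1/(1 - p) scaling and the preserved expectation yourself, and confirm that Dropout becomes a no-op in eval mode:

import torch
import torch.nn as nn

torch.manual_seed(0)

p = 0.2
m = nn.Dropout(p=p)
x = torch.ones(1000, 1000)

y = m(x)

# Surviving entries are scaled by 1 / (1 - p) = 1.25
print(y.unique())        # tensor([0.0000, 1.2500])

# The expected value is (approximately) preserved
print(x.mean().item())   # 1.0
print(y.mean().item())   # ~1.0

# In eval mode, Dropout is an identity op: no zeroing, no scaling
m.eval()
print(m(x).unique())     # tensor([1.])

Because the scaling happens at training time, no extra rescaling is needed at inference; calling model.eval() simply turns the layer off.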