How to set requires_grad_ to False (freeze) on PyTorch lazy layers?


PyTorch offers lazy layers, e.g. torch.nn.LazyLinear. I want to freeze some of these layers in my network, i.e. ensure they take no gradient steps. When I try .requires_grad_(False), I receive the error:

ValueError: Attempted to use an uninitialized parameter in <method 'requires_grad_' of 'torch._C._TensorBase' objects>. This error happens when you are using a `LazyModule` or explicitly manipulating `torch.nn.parameter.UninitializedParameter` objects. When using LazyModules Call `forward` with a dummy batch to initialize the parameters before calling torch functions
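
For reference, here is a minimal sketch that reproduces the error (the architecture is assumed purely for illustration):

    import torch
    import torch.nn as nn

    # Hypothetical model; any module containing a lazy layer behaves the same.
    model = nn.Sequential(
        nn.LazyLinear(64),  # in_features is inferred on the first forward pass
        nn.ReLU(),
        nn.LazyLinear(10),
    )

    # Raises the ValueError above: the weights are still UninitializedParameter objects.
    model[0].requires_grad_(False)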

How do I freeze lazy layers?


Solution

  • As the error message suggests, you need to run a dummy forward pass through a lazy module in order to fully initialize all of its parameters.

    In other words, assuming shape holds the correct input shape for your model:

    >>> model(torch.rand(shape))
    

    And only then can you freeze the desired layer with requires_grad_(False), as in the end-to-end sketch below.
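
    Putting it together, a minimal end-to-end sketch (the architecture, batch size, and input dimension are assumed for illustration):

        import torch
        import torch.nn as nn

        model = nn.Sequential(
            nn.LazyLinear(64),   # in_features is inferred from the first input
            nn.ReLU(),
            nn.LazyLinear(10),
        )

        # Dummy forward pass: materializes every UninitializedParameter.
        # The shape here is (batch_size, in_features); both values are arbitrary.
        model(torch.rand(2, 128))

        # The parameters are now regular tensors and can be frozen.
        model[0].requires_grad_(False)

        # Sanity check: only the second linear layer should remain trainable.
        for name, p in model.named_parameters():
            print(name, p.requires_grad)

    When you later build the optimizer, it is common to pass it only the still-trainable parameters, e.g. filter(lambda p: p.requires_grad, model.parameters()), so the frozen layer is skipped entirely.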