Tags: python, machine-learning, pytorch, learning-rate

PyTorch: Learning rate scheduler


How do I use a learning rate scheduler with the following optimizer?

optimizer = torch.optim.Adam(optim_params, betas=(args.momentum, args.beta), weight_decay=args.weight_decay)

I have written the following scheduler:

scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.9)

I am not sure whether I should step the scheduler or the optimizer first. In which order should I call the following?

optimizer.zero_grad()
scheduler.step()
optimizer.step()

Solution

  • Since 1.3 the behaviour has changed; see the releases and this issue in particular.

    Before this version, you were supposed to step the scheduler before the optimizer, which IMO wasn't reasonable. There was some back and forth (the change actually breaks backward compatibility, and IMO it's not a good idea to break it for such a minor inconvenience), but currently you should step the scheduler after the optimizer.

    optimizer.zero_grad()   # clear accumulated gradients
    optimizer.step()        # update the parameters first
    scheduler.step()        # then adjust the learning rate
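
    For context, here is a minimal end-to-end sketch of how this ordering typically fits into a training loop; the model, data, loss function and hyperparameter values below are placeholders for illustration, not taken from the question. The scheduler is stepped once per epoch after optimizer.step(), so with step_size=100 the learning rate is multiplied by gamma every 100 epochs:

    import torch
    import torch.nn as nn

    # Placeholder model, data and loss function -- assumptions for illustration only.
    model = nn.Linear(10, 1)
    criterion = nn.MSELoss()
    inputs = torch.randn(32, 10)
    targets = torch.randn(32, 1)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                                 betas=(0.9, 0.999), weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.9)

    for epoch in range(300):
        optimizer.zero_grad()                      # clear accumulated gradients
        loss = criterion(model(inputs), targets)   # forward pass
        loss.backward()                            # compute gradients
        optimizer.step()                           # update parameters first
        scheduler.step()                           # then decay the LR (by gamma every 100 epochs)

    print(optimizer.param_groups[0]["lr"])         # inspect the current learning rate

    If you instead call scheduler.step() once per batch, step_size counts those calls rather than epochs, so choose it accordingly.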