Tags: python, optimization, pytorch

Minimizing a function using PyTorch Optimizer - Return values are all the same


I'm trying to minimize a function in order to better understand the optimization process. As an example I used the Eggholder function (https://www.sfu.ca/~ssurjano/egg.html), which is two-dimensional. My goal is to get the values of my parameters (x and y) after every optimizer iteration so that I can visualize them afterwards.
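For reference, the Eggholder function from that page is

    f(x, y) = -(y + 47)\,\sin\left(\sqrt{\lvert y + x/2 + 47 \rvert}\right) - x\,\sin\left(\sqrt{\lvert x - (y + 47) \rvert}\right)

with x = x[0] and y = x[1] in the code below.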

Using PyTorch, I wrote the following code:

import torch

def eggholder_function(x):
    return -(x[1] + 47) * torch.sin(torch.sqrt(torch.abs(x[1] + x[0]/2 + 47))) - x[0] * torch.sin(torch.sqrt(torch.abs(x[0] - (x[1] + 47))))

def minimize(function, initial_parameters):
    list_params = []
    params = initial_parameters
    params.requires_grad_()
    optimizer = torch.optim.Adam([params], lr=0.1)

    for i in range(5):
        optimizer.zero_grad()
        loss = function(params)
        loss.backward()
        optimizer.step()
        list_params.append(params)

    return params, list_params

starting_point = torch.tensor([-30.,-10.])
minimized_params, list_of_params = minimize(eggholder_function, starting_point)

The output is as follows:

minimized_params: tensor([-29.4984, -10.5021], requires_grad=True)

and

list_of_params:

[tensor([-29.4984, -10.5021], requires_grad=True),
 tensor([-29.4984, -10.5021], requires_grad=True),
 tensor([-29.4984, -10.5021], requires_grad=True),
 tensor([-29.4984, -10.5021], requires_grad=True),
 tensor([-29.4984, -10.5021], requires_grad=True)]

While I understand that minimized_params is in fact the optimized result after the five steps, why does list_of_params show the same values for every iteration?

Thank you and have a great day!


Solution

  • Because they all refer to the same object: optimizer.step() updates params in place, so the list ends up holding five references to one and the same tensor. You can check this by comparing object identities:

    id(list_of_params[0]), id(list_of_params[1])  # both ids are identical: same object
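
    This is ordinary Python aliasing rather than anything PyTorch-specific. A minimal sketch of the same effect with a plain list (hypothetical names, just for illustration):

    history = []
    state = [0]
    for _ in range(3):
        state[0] += 1          # in-place update, like optimizer.step()
        history.append(state)  # stores a reference, not a copy
    print(history)             # [[3], [3], [3]]: every entry is the same object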
    

    You can detach and clone the params to store a snapshot of their current values instead:

    import torch
    def eggholder_function(x):
        return -(x[1] + 47) * torch.sin(torch.sqrt(torch.abs(x[1] + x[0]/2 + 47))) - x[0]*torch.sin(torch.sqrt(torch.abs(x[0]-(x[1]+47))))
    
    def minimize(function, initial_parameters):
        list_params = []
        params = initial_parameters
        params.requires_grad_()
        optimizer = torch.optim.Adam([params], lr=0.1)
    
        for i in range(5):
            optimizer.zero_grad()
            loss = function(params)
            loss.backward()
            optimizer.step()
            list_params.append(params.detach().clone())  # detach from the graph and clone so the snapshot keeps these values
            
        return params, list_params
    
    starting_point = torch.tensor([-30.,-10.])
    minimized_params, list_of_params = minimize(eggholder_function, starting_point)
    
    # list_of_params after the 5 iterations:
    [tensor([-29.9000, -10.1000]),
     tensor([-29.7999, -10.2001]),
     tensor([-29.6996, -10.3005]),
     tensor([-29.5992, -10.4011]),
     tensor([-29.4984, -10.5021])]
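
    Since the goal was to visualize the iterations, here is a minimal sketch of plotting the recorded trajectory (it assumes matplotlib is installed and runs after the minimize call above):

    import matplotlib.pyplot as plt

    xs = [p[0].item() for p in list_of_params]
    ys = [p[1].item() for p in list_of_params]
    plt.plot(xs, ys, marker="o")  # one point per optimizer step
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("Adam steps on the Eggholder function")
    plt.show()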