Tags: python, pytorch, mathematical-optimization, torch

torch.optim.LBFGS() does not change parameters


I'm trying to optimize the coordinates of the corners of an image. A similar technique works fine in Ceres Solver, but with torch.optim I'm having some issues: in particular, the optimizer for some reason does not change the parameters being optimized. I don't have much experience with PyTorch, so I'm pretty sure the error is trivial. Unfortunately, reading the documentation did not help me much.

Optimization model class:

class OptimizeCorners(torch.nn.Module):
    def __init__(self, real_corners):
        super().__init__()
        self._real_corners = torch.nn.Parameter(real_corners)

    def forward(self, real_image, synt_image, synt_corners, _threshold):
        # Find homography
        if visualize_warp_interpolate:
            real_image_before_processing = real_image
            synt_image_before_processing = synt_image
        homography_matrix = kornia.geometry.homography.find_homography_dlt(synt_corners,
                                                                           self._real_corners,
                                                                           weights=None)
        # Warp and resize synt image
        synt_image = kornia.geometry.transform.warp_perspective(synt_image.float(),
                                                                homography_matrix,
                                                                dsize=(int(real_image.shape[2]),
                                                                       int(real_image.shape[3])),
                                                                mode='bilinear',
                                                                padding_mode='zeros',
                                                                align_corners=True,
                                                                fill_value=torch.zeros(3))
        # Interpolate images
        real_image = torch.nn.functional.interpolate(real_image.float(),
                                                     scale_factor=5,
                                                     mode='bicubic',
                                                     align_corners=None,
                                                     recompute_scale_factor=None,
                                                     antialias=False)
        synt_image = torch.nn.functional.interpolate(synt_image.float(),
                                                     scale_factor=5,
                                                     mode='bicubic',
                                                     align_corners=None,
                                                     recompute_scale_factor=None,
                                                     antialias=False)

        # Calculate loss
        loss_map = torch.sub(real_image, synt_image, alpha=1)

        # Zero out elements <= _threshold (nn.Threshold(t, 0) keeps values above t)
        loss_map = torch.nn.Threshold(_threshold, 0)(loss_map)

        cumulative_loss = torch.sqrt(torch.sum(torch.pow(loss_map, 2)) /
                                     (loss_map.size(dim=2) * loss_map.size(dim=3)))

        return torch.autograd.Variable(cumulative_loss.data, requires_grad=True)

This is how I am trying to run the optimization:

# Convert corresponding images to PyTorch tensors
_image = kornia.utils.image_to_tensor(_image, keepdim=False)
_synt_image = kornia.utils.image_to_tensor(_synt_image, keepdim=False)
_corners = torch.from_numpy(_corners)
_synt_corners = torch.from_numpy(_synt_corners)
# Optimizer L-BFGS
n_iters = 100
h_lbfgs = []
lr = 1
optimize_corners = OptimizeCorners(_corners)
optimizer = torch.optim.LBFGS(optimize_corners.parameters(),
                              lr=lr)
for it in tqdm(range(n_iters), desc='Fitting corners',
               leave=False, position=1):
    loss = optimize_corners(_image, _synt_image, _synt_corners, _threshold)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step(lambda: optimize_corners(_image, _synt_image, _synt_corners, _threshold))
    h_lbfgs.append(loss.item())
    print(h_lbfgs)

Output from console: (screenshot omitted; it shows the parameter values staying unchanged across iterations)

So, as you can see, the parameters being optimized do not change.
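
One quick check (a diagnostic sketch, not part of the original post) is whether the parameter receives a gradient at all; if forward() returns a tensor that is detached from the graph, its grad never gets populated:

# Hypothetical diagnostic, placed right after loss.backward() in the loop above
for name, p in optimize_corners.named_parameters():
    print(name, p.grad)  # prints None if the loss is detached from _real_corners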

UPD: I changed return torch.autograd.Variable(cumulative_loss.data, requires_grad=True) to return cumulative_loss.requires_grad_(), and it actually works, but now I get an error after a few iterations (console output screenshot omitted).

UPD: this happens because the parameters being optimized turn into NaN after a few iterations.
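
A simple way to catch the moment this happens (again, a sketch that is not in the original post) is to test the parameters and their gradients for NaNs on every iteration:

# Hypothetical per-iteration NaN check for the parameters being optimized
for name, p in optimize_corners.named_parameters():
    if torch.isnan(p).any():
        print(f'NaN in parameter {name} at iteration {it}')
    if p.grad is not None and torch.isnan(p.grad).any():
        print(f'NaN in gradient of {name} at iteration {it}')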


Solution

  • After some time spent with the debugger, I found out that the main problem is that after a few iterations the backward() method starts to calculate the gradient incorrectly and outputs NaNs. As a result, the parameters being optimized are also calculated as NaNs. I didn't have a chance to find out exactly why this happens, because all the traces (I used torch.autograd.set_detect_anomaly(True)) pointed to the error occurring inside the C++ Torch engine, in the POW and SVD functions. In the end, in my case, the problem was solved by casting all parameters from float32 to float64 and reducing the learning rate. (A short sketch of how the pow/sqrt combination in this loss can produce NaN gradients is included at the end of this answer.)

    Here is the final updated code (the optimization driver first, then the updated forward method):

    # Convert corresponding images to PyTorch tensors
    _image = kornia.utils.image_to_tensor(_image, keepdim=False).double()
    _synt_image = kornia.utils.image_to_tensor(_synt_image, keepdim=False).double()
    _corners = torch.from_numpy(_corners).double()
    _synt_corners = torch.from_numpy(_synt_corners).double()

    # Optimizer L-BFGS
    optimize_corners = OptimizeCorners(_corners)
    optimizer = torch.optim.LBFGS(optimize_corners.parameters(),
                                  max_iter=20,
                                  lr=0.01)
    torch.autograd.set_detect_anomaly(True)

    def closure():
        optimizer.zero_grad()
        loss = optimize_corners(_image, _synt_image, _synt_corners, _threshold)
        loss.backward()
        return loss

    for it in tqdm(range(100), desc="Fitting corners", leave=False, position=1):
        optimizer.step(closure)
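
    If you also want to record the loss per iteration, as the question did with h_lbfgs, note that optimizer.step(closure) returns the loss from the closure evaluation, so it can be appended directly (a small variant, not part of the original answer):

    # Variant of the loop above that also logs the loss value each step
    h_lbfgs = []
    for it in tqdm(range(100), desc="Fitting corners", leave=False, position=1):
        loss = optimizer.step(closure)
        h_lbfgs.append(loss.item())

    The updated forward() method: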
    
    
    def forward(self, real_image, synt_image, synt_corners, _threshold):
        # Find homography
        if visualize_warp_interpolate:
            real_image_before_processing = real_image
            synt_image_before_processing = synt_image
    
        homography_matrix = kornia.geometry.homography.find_homography_dlt(synt_corners,
                                                                           self._real_corners,
                                                                           weights=None)
    
        # Warp and resize synt image
        synt_image = kornia.geometry.transform.warp_perspective(synt_image,
                                                                homography_matrix,
                                                                dsize=(int(real_image.shape[2]),
                                                                       int(real_image.shape[3])),
                                                                mode='bilinear',
                                                                padding_mode='zeros',
                                                                align_corners=True,
                                                                fill_value=torch.zeros(3))
    
        # Interpolate images
        real_image = torch.nn.functional.interpolate(real_image,
                                                     scale_factor=10,
                                                     mode='bicubic',
                                                     align_corners=None,
                                                     recompute_scale_factor=None,
                                                     antialias=False)
        synt_image = torch.nn.functional.interpolate(synt_image,
                                                     scale_factor=10,
                                                     mode='bicubic',
                                                     align_corners=None,
                                                     recompute_scale_factor=None,
                                                     antialias=False)
        # Calculate loss
        loss_map = torch.sub(real_image, synt_image, alpha=1)
    
        # Zero out elements <= _threshold (nn.Threshold(t, 0) keeps values above t)
        loss_map = torch.nn.Threshold(_threshold, 0)(loss_map)
    
        cumulative_loss = torch.sqrt(torch.sum(torch.pow(loss_map, 2)) /
                                     (loss_map.size(dim=2) * loss_map.size(dim=3)))
    
        return cumulative_loss.requires_grad_()
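
    For reference, here is a minimal, self-contained sketch (not from the original answer) of how the pow/sqrt combination in this loss can produce NaN gradients: the derivative of sqrt is infinite at zero, so when the thresholded loss map is exactly zero the chain rule evaluates inf * 0 = NaN. Adding a small epsilon under the square root keeps the gradient finite:

    import torch

    x = torch.zeros(3, requires_grad=True)

    loss = torch.sqrt(torch.sum(torch.pow(x, 2)))  # sqrt has an infinite slope at 0
    loss.backward()
    print(x.grad)  # tensor([nan, nan, nan])

    x.grad = None
    loss = torch.sqrt(torch.sum(torch.pow(x, 2)) + 1e-12)  # epsilon avoids the singularity
    loss.backward()
    print(x.grad)  # tensor([0., 0., 0.])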