Why does `scipy.optimize.minimize` always return the initial value?


I have a complete, working script that uses scipy.optimize.minimize, but it always returns the initial value as the optimized scalar parameter. Here is the complete code:

import sys
import random
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

# Define the real shift
shift = 50.5

data1_x = []
data1_y = []
data2_x = []
data2_y = []
for index in range(int(shift)):
    data2_x.append(index)
    data2_y.append(0)

for index in range(500):
    x = index
    if index<100:
        y = 0
    elif index<200:
        y = (index-100)
    elif index<300:
        y = 100
    elif index<400:
        y = 400 - index
    else:
        y = 0

    data1_x.append(x)
    data1_y.append(y)
    data2_x.append(x + shift)
    data2_y.append(y)

index_range = range(len(data2_x))

# The function to minimize, returning a float
def overlap(shift, data1_x, data1_y, data2_x, data2_y):
    sum_ = 0
    for index1 in range(len(data1_x)):
        x1 = data1_x[index1] + shift[0]
        index2 = min(index_range, key=lambda i: abs(data2_x[i]-x1))
        x2 = data2_x[index2]
        y1 = data1_y[index1]
        y2 = data2_y[index2]

        # Ignore x values outside of common range
        if abs(x2-x1)>5:
            continue
        sum_ += abs(y2 - y1)
        
    return sum_

# The initial guess is 40; try other initial values here.
result = opt.minimize(overlap, 40, args=(data1_x, data1_y, data2_x, data2_y))

# Print message indicating why the process terminated
print(result.message)

# Print the minimum value of the function
print(result.fun)

# Print the x-value resulting in the minimum value
print(result.x)
calculated_shift = result.x[0]

# Plot the original signals (top) and the signals realigned by the calculated shift (bottom)
plt.subplot(2, 1, 1)
plt.scatter(data1_x, data1_y, s=20, marker="o", c="b", label="Data1")
plt.scatter(data2_x, data2_y, s=5, marker="o", c="g", label="Data2")
plt.legend()

plt.subplot(2, 1, 2)
plt.scatter(data1_x, data1_y, s=20, marker="o", c="b", label="Data1")
plt.scatter([x-calculated_shift for x in data2_x], data2_y, s=5, marker="o", c="g", label="Data2")
plt.legend()
plt.tight_layout()
plt.show()

Why does the optimizer never move away from the initial value?


Solution

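  • minimize() assumes that the function it is optimizing is differentiable. This function is not: because it matches each point to its nearest neighbour, its output stays constant under tiny changes of the shift and only jumps when a different neighbour is picked.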

    You can see this by adding a print statement right before the end of your function.

    print(f"{shift[0]=} {sum_=}")
    return sum_
    

    When optimizing, this prints the following.

    shift[0]=40.33661619031734 sum_=2000
    shift[0]=40.336616205218505 sum_=2000
    

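    The optimizer evaluates the function at two inputs which differ by about 1.5 × 10^-8 (the small step it uses to estimate the gradient numerically) and gets back exactly the same result. Therefore, it concludes that the function is flat around this point and has no direction in which to move.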
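To illustrate the effect in isolation, here is a minimal sketch. `step_cost` is a hypothetical stand-in for `overlap` (it is not the function from the question), and `forward_difference` is a simplified version of the kind of numerical gradient estimate a gradient-based method relies on when no analytic gradient is supplied.

```python
def step_cost(shift):
    # Hypothetical toy cost: rounding makes it piecewise constant in shift,
    # similar in spirit to the nearest-neighbour matching in overlap().
    return abs(round(shift) - 50)


def forward_difference(f, x, eps):
    # Simplified one-sided finite-difference estimate of df/dx.
    return (f(x + eps) - f(x)) / eps


# A tiny step stays on the same plateau, so the estimated slope is zero.
tiny_step_slope = forward_difference(step_cost, 40.3, 1e-8)

# A step of one unit crosses a plateau boundary and reveals the downhill direction.
large_step_slope = forward_difference(step_cost, 40.3, 1.0)

print(tiny_step_slope, large_step_slope)
```

With the tiny step the estimated slope carries no information, while the unit step points toward the minimum. This is the same reason a larger `eps` can help on the real function.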

    There are four things you can try in this situation:

    1. Rewrite the function to be differentiable, so that when it gets slightly closer to the right answer, the return value gets slightly smaller.

    2. Use brute force. Since the output only changes when the shift changes by roughly a whole number, you could write a program that loops over every integer the shift could plausibly be, evaluates the function for each one, and keeps the best.

    3. Use an optimizer that doesn't require the function to be differentiable.

    4. Force minimize() to take larger steps when estimating the gradient, through the eps option. I also found it was necessary to change the method from the default BFGS to L-BFGS-B.

      This can find a shift value of ~50.

      result = opt.minimize(overlap, 40, args=(data1_x, data1_y, data2_x, data2_y), options={'eps': 1}, method='L-BFGS-B')
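Options 1 and 2 can also be combined. The following is a rough sketch under simplifying assumptions, not a drop-in replacement for the code in the question: both signals are sampled at integer x positions (the question's `data2_x` is not), and the names `trapezoid`, `interpolate`, `smooth_overlap` and `candidates` are hypothetical. Linear interpolation replaces the nearest-neighbour lookup, so the cost changes continuously with the shift, and a plain grid search then picks the best candidate.

```python
import math


def trapezoid(x):
    """Trapezoid signal shaped like the one in the question, for real-valued x."""
    if x < 100:
        return 0.0
    if x < 200:
        return x - 100.0
    if x < 300:
        return 100.0
    if x < 400:
        return 400.0 - x
    return 0.0


TRUE_SHIFT = 50.5  # same shift as in the question

# Assumption: both signals are sampled at integer x positions.
data1_y = [trapezoid(x) for x in range(500)]
data2_y = [trapezoid(x - TRUE_SHIFT) for x in range(650)]


def interpolate(samples, x):
    """Linearly interpolate samples taken at x = 0, 1, 2, ..."""
    lo = math.floor(x)
    frac = x - lo
    return (1.0 - frac) * samples[lo] + frac * samples[lo + 1]


def smooth_overlap(shift):
    """Sum of absolute differences; varies continuously with shift."""
    return sum(
        abs(interpolate(data2_y, x + shift) - y1)
        for x, y1 in enumerate(data1_y)
    )


# Brute force over a grid of candidate shifts: 0.0, 0.5, ..., 100.0
candidates = [k * 0.5 for k in range(201)]
best_shift = min(candidates, key=smooth_overlap)
print(best_shift, smooth_overlap(best_shift))
```

Because `smooth_overlap` shrinks steadily as the trial shift approaches the true one, the same function could also be handed to an optimizer instead of a grid search. The grid is used here only to keep the sketch free of dependencies.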