I am minimizing a function. When I pass the data as a tuple, everything works correctly but very slowly. To speed things up I converted the data to np.array, and now the run is fast but returns different values that are not acceptable. How can I get speed comparable to the np.array version while getting the same results as with the tuple?
import numpy as np
from scipy.optimize import minimize
import time
import warnings
warnings.filterwarnings("ignore", category=RuntimeWarning)
warnings.filterwarnings("ignore", category=UserWarning)
start_time = time.time()
def Bass1(x, P, Q, M):
    return (P * M + (Q - P) * (x)) - (Q / M) * (x ** 2)
def squareMistake1(k: tuple, *sales) -> float:
    p0 = 0
    c0 = sales[0]
    res = 0 # Function value
    for i in range(1, len(sales)):
        p = Bass1(c0, P=k[0], Q=k[1], M=k[2])
        c = c0 + p
        res += (c - sales[i]) ** 2
        p0 = p
        c0 = c
    return res
# Prepare data for minimization
# Initial parameter values
k0 = [0.0008791696672306727, 0.19252826585535315, 3328.9193848309856]
# All parameters are non-negative
kb = ((0, None), (0, None), (0, None))
# The dataset that is needed, ideal values, long execution time
generate = tuple([8.26192344363636, 9.20460066059596, 12.0178164697778, 15.921260267805, 21.2161740066094, 31.420434564131, 38.3904519471421, 52.3307819867071, 62.9113953016839, 85.1161924282732, 104.083879757882, 132.859216030029, 170.682620580279, 220.600045153997, 276.020526299077, 346.465021938078, 440.385091980306, 530.55442135112, 635.49205101167, 705.805860788812, 831.42968828187, 962.227395409379, 1140.31094904253, 1269.52053571083, 1418.17004626655, 1591.2135122193])
# A similar dataset, values are not satisfactory, short execution time
# generate = np.array([8.26192344363636, 9.20460066059596, 12.0178164697778, 15.921260267805, 21.2161740066094, 31.420434564131, 38.3904519471421, 52.3307819867071, 62.9113953016839, 85.1161924282732, 104.083879757882, 132.859216030029, 170.682620580279, 220.600045153997, 276.020526299077, 346.465021938078, 440.385091980306, 530.55442135112, 635.49205101167, 705.805860788812, 831.42968828187, 962.227395409379, 1140.31094904253, 1269.52053571083, 1418.17004626655, 1591.2135122193])
# List of methods used in minimization
method_list = ['Nelder-Mead', 'Powell', 'L-BFGS-B', 'TNC', 'SLSQP', 'trust-constr']
for i in method_list:
    # Minimize the sum of squares
    try:
        res = minimize(squareMistake1, k0, args=generate, method=i, bounds=kb)
    except Exception:
        print(f'Minimization using {i} method failed!')
        continue  # No result for this method, so skip the report below
    k = tuple(res.x) # Get a tuple of parameters
    print(f'k = {k}')
end_time = time.time()
total_time = end_time - start_time
print(f'Execution time: {total_time} seconds')
np.array and pd.DataFrame produce one result, while tuple produces another. I can't figure out what the issue is.
Take a look at how args is processed:
https://github.com/scipy/scipy/blob/f990b1d2471748c79bc4260baf8923db0a5248af/scipy/optimize/_minimize.py#L538
If args isn't a tuple, it is packed into one. So if args is an array, your objective function expects to receive many separate arguments, but it gets only one array. len(sales) is 1, so the for loop doesn't run and the function returns 0. That is also why the np.array run finishes so quickly: the objective does no work.
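To make the packing visible without running an optimizer, here is a minimal sketch. normalize_args is a hypothetical stand-in for the check described above, not SciPy's actual code, and count_sales is an illustrative function with the same signature shape as squareMistake1. A plain list plays the role of the array, since neither is a tuple.

```python
def normalize_args(args):
    # Hypothetical stand-in for the check described above:
    # anything that is not a tuple is wrapped into a one-element tuple.
    if not isinstance(args, tuple):
        args = (args,)
    return args


def count_sales(k, *sales):
    # Same signature shape as squareMistake1: the data arrives as *sales.
    return len(sales)


data = [8.26192344363636, 9.20460066059596, 12.0178164697778]

# Tuple: each value becomes its own positional argument.
n_tuple = count_sales(None, *normalize_args(tuple(data)))

# Non-tuple (a list here; np.array is also not a tuple): one single argument.
n_other = count_sales(None, *normalize_args(data))

print(n_tuple, n_other)  # 3 1
```

With one argument in sales, range(1, len(sales)) is empty, which matches the zero result described above.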
The documentation of minimize specifies that args must be a tuple, so please pass in a tuple for best results.
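If you want to keep the data in an np.array, one possible approach is to wrap the array in a one-element tuple and let the objective take it as a single parameter instead of *sales. The sketch below assumes that change of signature; bass_step and squared_error are illustrative names, and only the first eight values of the dataset are used to keep it short.

```python
import numpy as np
from scipy.optimize import minimize


def bass_step(x, P, Q, M):
    # Same expression as Bass1 in the question.
    return (P * M + (Q - P) * x) - (Q / M) * (x ** 2)


def squared_error(k, sales):
    # sales is one sequence (array or tuple), not *sales.
    c = sales[0]
    total = 0.0
    for target in sales[1:]:
        c = c + bass_step(c, P=k[0], Q=k[1], M=k[2])
        total += (c - target) ** 2
    return total


# First eight values of the question's dataset, to keep the sketch short.
sales = np.array([8.26192344363636, 9.20460066059596, 12.0178164697778,
                  15.921260267805, 21.2161740066094, 31.420434564131,
                  38.3904519471421, 52.3307819867071])
k0 = [0.0008791696672306727, 0.19252826585535315, 3328.9193848309856]
kb = ((0, None), (0, None), (0, None))

# Wrapping the array in a one-element tuple satisfies the args contract.
res = minimize(squared_error, k0, args=(sales,), method='Nelder-Mead', bounds=kb)
print(res.x, res.fun)
```

Keeping the original *sales signature and passing args=tuple(generate) should work as well. Either way the objective now does real work on every call, so the run time is likely to be closer to the tuple version than to the earlier array run.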