I'm trying to optimize a function based on discrete data sets I got in the lab.
Spectrum_1 and Spectrum_2 are experimental data sets with length N. Those two arrays contain the values taken at each lambda (wavelength) value in the lambdas array.
So:
len(Spectrum_1) == len(Spectrum_2) == len(lambdas)
Each of Spectrum_1 and Spectrum_2 has the symbolic form

Spectrum = G(T, lambda) * E(lambda)
E(lambda) is unknown, but it is known not to vary with T. E(lambda) would be discrete data (a set of points defined at each value in the lambdas array).
G(T, lambda) is a known function of T and wavelength lambda (it is a continuous, well-defined function). To be more specific, it is the Planck blackbody radiation equation as a function of wavelength.
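For concreteness, a minimal sketch of what G(T, lambda) looks like, assuming SI units (wavelengths in metres, temperatures in kelvin); your actual version of the equation may differ in units or constants:

```python
import numpy as np

# Physical constants in SI units
h = 6.62607015e-34   # Planck constant, J*s
c = 2.99792458e8     # speed of light, m/s
k_B = 1.380649e-23   # Boltzmann constant, J/K

def G(T, lam):
    """Planck spectral radiance as a function of wavelength lam (m)
    and temperature T (K)."""
    return (2.0 * h * c**2 / lam**5) / (np.exp(h * c / (lam * k_B * T)) - 1.0)

# Example: radiance at 5 micrometres for T = 700 K
print(G(700.0, 5e-6))
```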
E_1 and E_2 should be equal:
E_1 = Spectrum_1/np.array([G(T_1, lamda) for lamda in lambdas])
E_2 = Spectrum_2/np.array([G(T_2, lamda) for lamda in lambdas])
T_1 and T_2 are unknown, but I know they are both within the 400 to 1000 range. T_1 and T_2 are two scalar values.
Knowing that, I need to minimize:
np.sum(np.array([(E_1[i] - E_2[i])**2 for i in range(len(lambdas))]))
Or at least that's what I think I should minimize. Ideally (E_1[i] - E_2[i]) == 0, but that won't be the case, given that the experimental data in Spectrum_1 and Spectrum_2 contain noise and distortions due to atmospheric transmission.
I'm not very familiar with optimizing with multiple unknown variables (T_1 and T_2) in Python. I suppose I could brute-force test millions of combinations of T_1 and T_2, but I wish to do it correctly. I wonder if anybody could help me.
I hear scipy.optimize could do it for me, but many methods ask for a Jacobian and Hessian, and I'm unsure how to proceed given I have experimental data (Spectrum_1 and Spectrum_2) and I'm not dealing with continuous/smooth functions.
Given the data and assumptions, I see no reason why scipy.optimize.minimize() wouldn't work.
I hear scipy.optimize could do it for me, but many methods ask for Jacobian and Hessian
Those aren't required. If you don't provide them, they will be estimated using numerical differentiation, i.e. varying the inputs a little and seeing how much the outputs change. This assumes the function is continuous, though.
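As a rough sketch of what numerical differentiation does under the hood (scipy's actual implementation chooses step sizes more carefully, but the idea is the same):

```python
import numpy as np

def forward_difference_gradient(f, x, eps=1e-8):
    """Estimate the gradient of f at x by nudging each input by eps
    and measuring how much the output changes."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    grad = np.empty_like(x)
    for i in range(len(x)):
        x_step = x.copy()
        x_step[i] += eps
        grad[i] = (f(x_step) - f0) / eps
    return grad

# Example: the gradient of f(x, y) = x**2 + 3*y at (2, 5) is (4, 3)
print(forward_difference_gradient(lambda p: p[0]**2 + 3*p[1], [2.0, 5.0]))
```

This is why continuity matters: the estimate is only meaningful if small input changes produce proportionally small output changes.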
I'm unsure how to proceed given I have experimental data (Spectrum_1 and Spectrum_2) and I'm not dealing with continuous/smooth functions.
As I understand it, you do have a continuous function. You've said that G(T, lambda) is a continuous function, and the fact that you're applying it to many different data samples doesn't make it stop being continuous. Similarly, the other operations (subtraction, squaring, division, and summing) are also continuous. (With the exception of division by zero, but you can write a constraint to forbid this if G(T, lambda) can be zero.)
You could try something along these lines:
import numpy as np
from scipy.optimize import minimize

# Put experimental data here
Spectrum_1 = np.array([...])
Spectrum_2 = np.array([...])
lambdas = np.array([...])

def loss(params):
    T_1, T_2 = params
    E_1 = Spectrum_1 / np.array([G(T_1, lamda) for lamda in lambdas])
    E_2 = Spectrum_2 / np.array([G(T_2, lamda) for lamda in lambdas])
    # Use vectorization to subtract E_2 from E_1 elementwise
    difference = E_1 - E_2
    return np.sum(difference**2)

result = minimize(
    loss,
    x0=[700, 700],  # Initial guess for T_1 and T_2, inside the known range
    bounds=[(400, 1000), (400, 1000)],  # You said both are within 400 to 1000
)
print(result)
If you think this is a function which may have many local minima, you may also want to try one of the global optimizers. Also, all of the global optimizers except basinhopping work even when the function is not continuous.
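For example, scipy.optimize.differential_evolution takes your 400-1000 bounds directly and needs no initial guess. Here a toy quadratic with a known minimum stands in for your real loss, just to show the call shape:

```python
from scipy.optimize import differential_evolution

# Toy stand-in for the real loss: its minimum is at T_1 = 650, T_2 = 800
def loss(params):
    T_1, T_2 = params
    return (T_1 - 650.0)**2 + (T_2 - 800.0)**2

# Bounds encode the known 400-1000 range for both temperatures
result = differential_evolution(
    loss,
    bounds=[(400, 1000), (400, 1000)],
    seed=1,  # fixed seed just for reproducibility
)
print(result.x)  # close to [650, 800]
```

In your case you would drop in the real loss from above and keep the same bounds.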