I'm trying to do a multi-gaussian deconstruction of a spectral measurement I took, but I'm getting ValueError: 'x0' is infeasible
. I'm using curve_fit
and I'm trying to break down my measurement into 3 Gaussians. Here's my code:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
directory=r"C:\Users\jh\data"
filename = "test(0,1).txt"
filepath = os.path.join(directory, filename)
data = pd.read_csv(filepath, delimiter = "\t")
x = data['Wavelength (nm)']
y = data['Absorbance']
#Gaussian 1
amp1 = 100
cen1 = 450
sig1 = 15
#Gaussian 2
amp2 = 50
cen2 = 550
sig2 = 10
#Gaussian 3
amp3 = 20
cen3 = 760
sig3 = 5
p0 = [amp1, cen1, sig1, amp2, cen2, sig2, amp3, cen3, sig3]
def gaussian(x, amp, cen, sig):
gau = amp*(1/(sig*(np.sqrt(2*np.pi))))*(np.exp((-1/2)*(((x-cen)/sig)**2)))
return gau
def three_gaussian(x, amp1, cen1, sig1, amp2, cen2, sig2, amp3, cen3, sig3): #Amp: Amplitude, Cen: Center, Sig: Sigma
gau1 = amp1*(1/(sig1*(np.sqrt(2*np.pi))))*(np.exp((-1/2)*(((x-cen1)/sig1)**2)))
gau2 = amp2*(1/(sig2*(np.sqrt(2*np.pi))))*(np.exp((-1/2)*(((x-cen2)/sig2)**2)))
gau3 = amp3*(1/(sig3*(np.sqrt(2*np.pi))))*(np.exp((-1/2)*(((x-cen3)/sig3)**2)))
return gau1 + gau2 + gau3
popt, pcov = curve_fit(three_gaussian, x, y, p0, bounds = ((400, -np.inf, -np.inf, 500, -np.inf, -np.inf, 650, -np.inf, -np.inf), (470, np.inf, np.inf, 600, np.inf, np.inf, 900, np.inf, np.inf)))
RRa_pars = popt[0:3]
RRe_pars = popt[3:6]
Pol_pars = popt[6:9]
RRa = gaussian(x, *RRa_pars)
RRe = gaussian(x, *RRe_pars)
Pol = gaussian(x, *Pol_pars)
I'm positive that the problem is this line from the curve_fit
call:
popt = curve_fit(three_gaussian, x, y, p0, bounds = ((400, -np.inf, -np.inf, 500, -np.inf, -np.inf, 650, -np.inf, -np.inf), (470, np.inf, np.inf, 600, np.inf, np.inf, 900, np.inf, np.inf)))
I previously tried this code without the bounds and it was working, but the model was inaccurate to what I was expecting. So I introduced the bounds to try and constrain the center to where I was expecting. That's what introduced the ValueError: 'x0' is infeasible
. This is what the fit looks like without the bounds:
The fit is good, but the two Gaussians should be closer in amplitude and they should be more shifted left. (I'm neglecting the last Gaussian in this image because it's not the one with a problem).
I'm not really sure how to fix this problem.
When you provide bounds and an initial guess, the initial guess must satisfy the bounds, otherwise the initial guess is considered "infeasible" (that's just another way of saying it doesn't satisfy the constraints). Consider this modified version of the scipy.optimize.curve_fit
example from the documentation.
import numpy as np
from scipy.optimize import curve_fit
def func(x, a, b, c):
return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
rng = np.random.default_rng(42)
y_noise = 0.2*rng.normal(size=xdata.size)
ydata = func(xdata, 2.5, 1.3, 0.5) + y_noise
If we call the curve fit like so:
popt, pcov = curve_fit(func, xdata, ydata, p0=(2., 0.6, 0.2), bounds=(0, [3., 2., 1.]))
There aren't any issues because the initial guess satisfies the constrains.
But if we call it like this:
popt, pcov = curve_fit(func, xdata, ydata, p0=(4., 0.6, 0.2), bounds=(0, [3., 2., 1.]))
We get ValueError: 'x0' is infeasible
since the a
initial guess is not within the bounds.