Search code examples
pythonscipyleast-squares

Least Squares Method for a sum of functions


I would like to use the curve_fit function from the scipy.optimize module to determine amplitudes, frequencies, phases of sum of sine functions (and one y0). It's easy to do when I know a number of sines to use. For example when I know two frequencies from the DFT (Discrete Fourier Transform): 1.152 and 0.432 I can define a function:

def func(x, amp1, amp2, freq1 , freq2, phase1, phase2, y0):
    return amp1*np.sin(freq1*x + phase1) + amp2*np.sin(freq2*x + phase2) + y0

Then, using the curve_fit and constraining intervals of frequencies I can find a good fitting:

param, _ = curve_fit(func, t, data, bounds=([-np.inf, -np.inf, 1.14, 0.43, -np.inf, -np.inf, -np.inf], [np.inf, np.inf, 1.16, 0.44, np.inf, np.inf, np.inf]))

It looks great: enter image description here

But in this case I've prepared the data and I've known a number of frequencies. Do you know how to define the func only once and handle all cases (for example five sine functions)? I've tried to put the parameters into lists, e.g. amp = [amp1, amp2, ... ] and I've iterated over their length. But there is a problem to define bounds for parameter lists. bounds is very important to ensure reality model.

The solution does not have to based on curve_fit.


Solution

  • Assuming you know the frequencies beforehand the problem is simple. You can set the lower bound to 0 and set the upper bound to 2 * pi * freq for frequency. For amps, set any number (or np.inf if you want no boundary).

    You can formulate the function in the form lambda x, amp1, phase1, amp2, phase2... : y, curve_fit can accept a function of undefined number of arguments as long as you supply a proper initial guess.

    A sample code for five frequencies:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit
    
    x = np.linspace(0,10,60)
    w = [1,2,3,4,5]
    
    a = [1,4,2,3,0.1]
    x0 = [0,1,0,1,0.5]
    
    y = np.sum(a_i * np.sin(w_i * x - x0_i) for w_i, a_i, x0_i in zip(w,a, x0))  #base_data
    yr = y + np.random.normal(0,0.5, size=x.size)   #noisy data
    
    def func(x, *args):
        """ function of the form lambda x, amp1, phase1, amp2, phase2...."""
        return np.sum(a_i * np.sin(w_i * (x-x0)) for w_i, a_i, x0
                      in zip(w,args[::2], args[1::2]))
    
    ubounds = np.zeros(len(w) * 2)
    ubounds[::2] = 10   #setting amp max value to 10 (arbitrary)
    ubounds[1::2] = np.asarray(w) * 2 * np.pi
    p0 = [0] * 10   # note p0 size
    popt, pcov = curve_fit(func, x, yr, p0, bounds=(0, ubounds))
    amps, phases = popt[::2], popt[1::2]
    
    plt.plot(x,func(x, *popt))
    plt.plot(x,yr, 'go')