
Fitting a group of sigmoids with different models using least_squares()?


For a series of experiments measured at different time points, I'm trying to compare fits that share one set of parameters across all experiments with fits that have an individual set of parameters per experiment, using an information criterion (such as the Akaike IC). Is that possible with scipy.optimize.least_squares?
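(For the comparison itself, here is a minimal sketch of how I would compute the AIC from a least_squares result, assuming independent Gaussian noise; the helper name aic_from_result is my own, not part of SciPy:)

import numpy as np

# AIC up to an additive constant, assuming i.i.d. Gaussian residuals
def aic_from_result(res):
    n   = res.fun.size              # number of residuals
    k   = res.x.size                # number of fitted parameters
    rss = np.sum(res.fun ** 2)      # residual sum of squares
    return n * np.log(rss / n) + 2 * k

The model with the lower AIC would then be preferred.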

Let's say that the measurements roughly follow sigmoid curves in time, with parameters base (level before the step), height (increase in level after the step), stretch (duration of the step) and t50 (the time where the level is base + height/2). I model these in the code below, which fills a [t, c]-shaped numpy ndarray measurements with t time points (experiments) and c values per time point:

import numpy as np
import matplotlib.pyplot as plt

# lifted, scaled, stretched and shifted sigmoid
def lssig(t, base, height, stretch, t50):
    return base + height / (1 + np.exp(-1 / stretch * (t - t50)))

# number of curves per experiment
numcurves = 10

# number of experiments (time pts)
num_experiments = 20

# timepoints (with simulated variability in timing)
time_jitter = 2
tstart  = np.linspace(-10, 30, num=num_experiments)
tstart += time_jitter * np.random.rand(num_experiments)

# measurements + added noise (this is added noise, not the variability in the estimates)
measurements  = np.empty((num_experiments, numcurves))
measure_noise = np.random.normal(0, .2, (num_experiments, numcurves))

# Variability of model parameters between curves:
# par_spreading[0] -> variability in base (start value)
# par_spreading[1] -> variability in height (end - start)
# par_spreading[2] -> variability in stretch (duration of step)
# par_spreading[3] -> variability in t50 (middle of step)
par_spreading = np.ones(4)

# parameters of the model:
# every curve can have its own parameters, variability between curves
# is given by the par_spreading value (see above) for that parameter
base    = 2 * np.ones(numcurves) + par_spreading[0] * np.random.rand(numcurves)
height  = 8 * np.ones(numcurves) + par_spreading[1] * np.random.rand(numcurves)
stretch = 4 * np.ones(numcurves) + par_spreading[2] * np.random.rand(numcurves)
t50     = 9 * np.ones(numcurves) + par_spreading[3] * np.random.rand(numcurves)

# fill the measurement array
for t in range(num_experiments):
    for c in range(numcurves):
        measurements[t, c] = lssig(tstart[t], base[c], height[c], stretch[c], t50[c])
measurements += measure_noise

# plot curves per subject
plt.plot(tstart, measurements)
plt.show()

[Figure: different time points (from left to right) with different experiments at each point (different colours)]

There is a very nice page with examples of using least_squares, where a single sigmoid is fitted and the parameters of the sigmoid are passed in an array theta. Using that recipe I can fit the first curve at every time point, measurements[:, 0]:

from scipy.optimize import least_squares

# point on the sigmoid, using array theta for parameters
def y(theta, t):
    return theta[0] + theta[1] / (1 + np.exp(-1 / theta[2] * (t - theta[3])))

# objective function for least_squares: difference between noisy data & model
def fun(theta):
    return y(theta, tstart) - measurements[:, 0]

# starting values for base, height, stretch and t50
theta0 = [2, 8, 4, 9]
res1   = least_squares(fun, theta0)

# show the parameters & plot the functions
print('real: {}, estimated: {}'.format([base[0], height[0], stretch[0], t50[0]], res1.x))

plt.plot(tstart, measurements[:, 0], tstart, y(res1.x, tstart))
plt.legend(('noisy', 'lsq'))
plt.show()

The reason that I used the 4 separate parameters in the model code above is that base can either be a single scalar (the same for all curves) or an array (an individual base value per curve). Is it possible to use least_squares this way? Or do all the parameters need to be packed into one 1D array?

It is possible to flatten the array of parameters (as with theta), but that becomes incredibly messy: in the model where every experiment has its own base parameter, the second value in the array would be a base value, whereas in the model where there is one base parameter for all experiments, the second value in the array would already be a height value, and so on.
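To make that concrete, here are two hypothetical flat layouts for a 3-curve fit (the numbers are placeholders):

# individual base per curve: theta[1] is the second base value
theta_individual = [2.0, 2.1, 1.9, 8.0, 8.2, 7.9]  # [b0, b1, b2, h0, h1, h2]
# one shared base: theta[1] is already a height value
theta_shared     = [2.0, 8.0, 8.2, 7.9]            # [b, h0, h1, h2]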

Ideally the parameters would stay separate, so that their lengths could vary independently. Is there an administratively traceable / aesthetically pleasing way to do this?


Solution

  • Maybe you could greatly reduce your question's length and improve readability by giving a minimal reproducible example! ;)

    If I understood your issue correctly:

    To avoid handling a "messy" set of parameters, I often create two small functions such as param2minimizer and its inverse minimizer2param. These functions do the job of organizing all the parameters for the minimizer. The code is then much easier to understand.

    For example, let's say you want to optimize any number of [base, height] pairs:

    def param2minimizer(bases, heights):
        # pack the separate parameter arrays into one 1D vector
        return np.concatenate((bases, heights))

    def minimizer2param(m):
        # unpack the 1D minimizer vector back into separate arrays
        L = len(m)
        if L % 2:
            raise ValueError("Incompatible length")
        bases = m[0:L//2]
        heights = m[L//2:]
        return bases, heights
    

    You can modify these two functions to match your needs exactly; a usage sketch follows below.
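    For illustration, here is a minimal sketch of how the two helpers could be wired into least_squares (model, bases0, heights0, tstart and measurements are placeholder names standing in for your sigmoid and data, not a definitive implementation):

    def residuals(m, t, data):
        # unpack the 1D minimizer vector into the separate parameter arrays
        bases, heights = minimizer2param(m)
        # one model curve per (base, height) pair, stacked as columns
        pred = np.stack([model(t, b, h) for b, h in zip(bases, heights)], axis=1)
        # least_squares expects a 1D residual vector
        return (pred - data).ravel()

    m0  = param2minimizer(bases0, heights0)   # pack the starting guesses
    res = least_squares(residuals, m0, args=(tstart, measurements))
    bases_fit, heights_fit = minimizer2param(res.x)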

    Regarding your question on the number of dimensions of the minimizing variables, you may have a look at the documentation of least_squares. I think it is a better habit to systematically pass a 1D vector rather than a 2D matrix, even if a specific minimizer happens to accept 2D input. Some minimizers only accept 1D input; moreover, the optimization algorithms are theoretically (and often numerically) described in terms of a 1D vector of variables. It is safer to use 1D, and it conforms better to common usage.
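    For instance, a 2D parameter matrix can be round-tripped through a 1D vector like this (shapes chosen arbitrarily for illustration):

    params_2d = np.arange(8.0).reshape(4, 2)  # e.g. 4 parameter types x 2 curves
    flat      = params_2d.ravel()             # 1D vector handed to least_squares
    restored  = flat.reshape(4, 2)            # rebuilt inside the residual function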