Search code examples
pythonplotlogistic-regressiondata-fittingsigmoid

I am trying to fit a logistic sigmoid function to data. The data in the y-axis is reversed to show length change over x-axis progression. T


I have many datasets which start at a higher y-axis value, then decrease. This happens in traditional S-curve fashion, so it actually looks like a reverse sigmoid. I successfully flipped the y-data so that the maximum value is at the bottom in the y-axis, and it decreases upward. So, data plots fine. The curve now looks like a standard sigmoid.

The Big Issue: I cannot fit the sigmoid to it. I have done everything I can think of for a few days.

The best result I have is in the attached image. I have many fits that just look incomplete and nowhere close to what a sigmoid fit should be.

Example of best result it plots: (https://i.sstatic.net/NPVry.png)

The code I have:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# 1. Data set: 
x_data = np.array([130, 125, 120, 115, 110, 105, 100, 95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40,     
35, 30, 25])
y_data = np.array([81, 81, 80.5, 79, 77.5, 76, 74.5, 71.5, 68.5, 67, 64, 59.5, 55, 48.5, 44, 42.5, 42,     
41.5, 41.5, 41, 41, 41])

# 2. Reverse the y-values: 
y_data_reversed = np.flip(y_data)
y_range = np.linspace(41, 61.75, 81)

# 3. Sigmoid function defined: 
def sigmoid(x, U, k, a, b): 
    b = 81 #-the maxmimum y_data value. 
    return (U / (1 + np.exp(-k * (x - 77.5))))**a + b

p0 = [np.max(y_data), np.median(x_data), 1, np.min(y_data)] #-as an initial guess for the fit     
calculation. 
popt, pcov = curve_fit(sigmoid, x_data, y_data, p0, maxfev = 10000) #method = 'lm')

# 4. Sigmoid fit to data: 
x_fit = np.linspace(25, 77.5, 130) #-min(x_data), np.median(x_data), max(x_data)). 
y_fit = sigmoid(x_fit, *popt) 

# 5. Plot data and sigmoid fit:
plt.plot(x_data, y_data_reversed, marker='o', linestyle='', color='b', label='Data')
plt.plot(x_fit, y_fit, color='r', label='Sigmoid Fit')
plt.xlabel('Temperature (°C)')
plt.ylabel('Length (mm)')
plt.title('Sigmoid Fit for Contraction Data')
plt.gca().invert_yaxis()  # ...Invert y-axis to start with highest value at the top
plt.legend()
plt.grid(True)
plt.show() 

Solution

  • In general, making an exponent a decision-variable, as in the a argument in your sigmoid function, is dangerous because you can easily cause numerical over/underflow or even generate complex numbers depending on what values the optimizer chooses for the parameters.

    Your b parameter is redundant because you set b to a fixed value inside the sigmoid function. Was that intentional?

    Here's a simplified model I fitted to your data without the a parameter.

    I didn't flip the y-data and the y-axis of the plot as it is just a source of confusion. Instead, I put a minus sign inside the sigmoid function.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit
    
    # 1. Data set: 
    x_data = np.array([
        130, 125, 120, 115, 110, 105, 100, 95, 90, 85,
        80, 75, 70, 65, 60, 55, 50, 45, 40, 35, 30, 25
    ])
    y_data = np.array([
        81, 81, 80.5, 79, 77.5, 76, 74.5, 71.5, 68.5, 67, 
        64, 59.5, 55, 48.5, 44, 42.5, 42, 41.5, 41.5, 41, 41, 41
    ])
    
    # 3. Sigmoid function defined: 
    def sigmoid(x, U, k, b):
        a = 1 
        return -(U / (1 + np.exp(-k * (x - 77.5))))**a + b
    
    p0 = [40, -0.1, 81]
    x_fit = np.linspace(25, 130, 101)
    y_init = sigmoid(x_fit, *p0)
    
    popt, pcov = curve_fit(sigmoid, x_data, y_data, p0, maxfev = 10000)
    y_est = sigmoid(x_data, *popt)
    mse = np.mean((y_data - y_est) ** 2)
    print(f"Mean-squared error: {mse:.3f}")
    
    # 4. Sigmoid fit to data:
    y_fit = sigmoid(x_fit, *popt)
    
    # 5. Plot data and sigmoid fit:
    plt.plot(x_data, y_data, marker='o', linestyle='', color='b', label='Data')
    plt.plot(x_fit, y_init, color='r', label='Initial guess')
    plt.plot(x_fit, y_fit, color='g', label='Sigmoid Fit')
    plt.xlabel('Temperature (°C)')
    plt.ylabel('Length (mm)')
    plt.title('Sigmoid Fit for Contraction Data')
    plt.legend()
    plt.grid(True)
    plt.show()
    

    Output:

    Mean-squared error: 1.679
    

    enter image description here

    Additional Result

    You can try including the a parameter and see what happens. I managed to get a good fit with the following 5-parameter model:

    def sigmoid(x, U, k, a, b, x0):
        return -((U / (1 + np.exp(-k * (x - x0))))**a) + b
    
    p0 = [40, -0.1, 1.0, 81, 77.5]
    

    Output:

    Mean-squared error: 0.227
    

    enter image description here But use some caution: In this case U was equal to 4.59693084e+19 and a was 85.8494729 !!! This model will probably not behave as expected when extrapolating outside the region of the data you have...

    A better strategy is probably to set the a parameter manually as in the previous code and run the optimization a few times with different a values until you get a reasonable solution.