Problem Fitting a Residence Time Distribution Data

I am trying to fit Resident Time Distribution (RTD) Data. RTD is typically skewed distribution. I have built a simple code that takes this non equally space-time data set from the RTD.

Data Sett
timeArray = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 14.0]
concArray = [0.0, 0.6, 1.4, 5.0, 8.0, 10.0, 8.0, 6.0, 4.0, 3.0, 2.2, 1.5, 0.6, 0.0]

To fit the data I have been using python curve_fit function

parameters, covariance = curve_fit(nCSTR, time, conc, p0=guess)

and different sets of models (ex. CSTR, Sine, Gauss) to fit the data. However, no success so far.

The RTD data that I have correspond to a CSTR and there is an equation that model very accurate this type of behavior.

#Generalize nCSTR model
y = (( (np.power(x/tau,n-1)) * np.power(n,n) ) / (tau * math.gamma(n)) ) * np.exp(-n*x/tau)

As a separate note: from the Generalized nCSTR model I am using gamma instead of (n-1)! factorial terms because of the complexities of the code trying to deal with decimal values in factorials terms.

This CSTR model should be the one fitting the data without problem but for some reason is not able to do so. The outcome after executing my code:

timeArray = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5,  4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0]
concArray = [0.0, 0.6, 1.4, 2.6, 5.0, 6.5, 8.0, 9.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.5, 3.0, 2.5, 2.2, 1.8,  1.5,  1.2,  1.0,  0.8,  0.6,  0.5,  0.3,  0.1,  0.0]

#Recast time and conc into numpy arrays
time = np.asarray(timeArray)
conc = np.asarray(concArray)
plt.plot(time, conc, 'o')


def nCSTR(x, tau, n):
    y = (( (np.power(x/tau,n-1)) * np.power(n,n) ) / (tau * math.gamma(n)) ) * np.exp(-n*x/tau)
    return y

guess = [1, 12]
parameters, covariance = curve_fit(nCSTR, time, conc, p0=guess)

tau = parameters[0]
n = parameters[1]

y = np.arange(0.0, len(time), 1.0)

for i in range(len(timeArray)):
    y[i] = (( (np.power(time[i]/tau,n-1)) * np.power(n,n) ) / (tau * math.gamma(n)) ) * np.exp(-n*time[i]/tau)

plt.plot(time,y)

is this plot Fitting Output

I know I am missing something and any help will be well appreciated. The model has been well known for decades so it should not be related to the equation. I did some dummy data to confirm that the equation is written correctly and the output was the same type of profile that I am looking for. In that end, the equestion is fine.

import numpy as np
import math

t = np.arange(0.0, 10.5, 0.5)
tau = 2
n = 5
y = np.arange(0.0, len(t), 1.0)

for i in range(len(t)):
    y[i] = (( (np.power(t[i]/tau,n-1)) * np.power(n,n) ) / (tau * math.gamma(n)) ) * np.exp(-n*t[i]/tau)

print(y)

plt.plot(t,y)

CSTR profile with Dummy Data (image)

If anyone is interested in the theory behind it I recommend any reading related to Tank In Series (specifically CSTR) Fogler has a great book about this topic.

Solution

I think that the main problem is that your model does not allow for an overall scale factor or that your data may not be normalized as you expect.

If you'll permit me to convert your curve-fitting program to use lmfit (I am a lead author), you might do:

import numpy as np
from scipy.special import gamma
import matplotlib.pyplot as plt
from lmfit import Model

timeArray = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5,  4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0]
concArray = [0.0, 0.6, 1.4, 2.6, 5.0, 6.5, 8.0, 9.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.5, 3.0, 2.5, 2.2, 1.8,  1.5,  1.2,  1.0,  0.8,  0.6,  0.5,  0.3,  0.1,  0.0]

#Recast time and conc into numpy arrays
time = np.asarray(timeArray)
conc = np.asarray(concArray)

plt.plot(time, conc, 'o', label='data')
    
def nCSTR(x, scale,  tau, n):
    """scaled CSTR model"""
    z = n*x/tau
    return scale * np.exp(-z) * z**(n-1) * (n/(tau*gamma(n)))

# create a Model for your model function
cmodel = Model(nCSTR)

# now create a set of Parameters for your model (note that parameters
# are named using your function arguments), and give initial values
params = cmodel.make_params(tau=3, scale=10, n=10)

# since you have `xxx**(n-1)`, setting a lower bound of 1 on `n`
# is wise, otherwise you would have to handle complex values
params['n'].min = 1

# now fit the model to your `conc` data with those parameters
# (and also passing in independent variables using `x`: the argument
# name from the signature of the model function)
result = cmodel.fit(conc, params, x=time)

# print out a report of the results
print(result.fit_report())

# you do not need to construct the best fit yourself, it is in `result`:
plt.plot(time, result.best_fit, label='fit')    
plt.legend()
plt.show()

This will print out a report that includes statistics and uncertainties:

[[Model]]
    Model(nCSTR)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 29
    # data points      = 29
    # variables        = 3
    chi-square         = 2.84348862
    reduced chi-square = 0.10936495
    Akaike info crit   = -61.3456602
    Bayesian info crit = -57.2437727
    R-squared          = 0.98989860
[[Variables]]
    scale:  49.7615649 +/- 0.81616118 (1.64%) (init = 10)
    tau:    5.06327482 +/- 0.05267918 (1.04%) (init = 3)
    n:      4.33771512 +/- 0.14012112 (3.23%) (init = 10)
[[Correlations]] (unreported correlations are < 0.100)
    C(scale, n)   = -0.521
    C(scale, tau) = 0.477
    C(tau, n)     = -0.406

and generate a plot of