Tags: python, function, curve-fitting

Fit a custom function in Python


I'm trying to fit my data with the following function:

y = (A / x) * exp(-(1/2) * (ln(x / mu) / ln(SD1))^2)

The data that I'm using is the following:

X1: 
0        1.0
1      101.0
2      201.0
3      301.0
4      401.0
5      501.0
6      601.0
7      701.0
8      801.0
9      901.0
10    1001.0
11    1101.0
12    1201.0
13    1301.0
14    1401.0
15    1501.0
16    1601.0
17    1701.0
18    1801.0
19    1901.0

Y1: 
0     0.121159
1     0.195525
2     0.167305
3     0.125499
4     0.094138
5     0.071610
6     0.053848
7     0.039890
8     0.031099
9     0.023976
10    0.018271
11    0.013807
12    0.010596
13    0.008033
14    0.006710
15    0.005222
16    0.004299
17    0.003376
18    0.002976
19    0.002659

My code to call the function looks as follows:

def logN(X1, mu, SD1):
  return A / X1 * np.exp(-0.5 * (np.log(X1 / mu) ** 2 / np.log(SD1) ** 2))

params, pcov = curve_fit(logN, X1, Y1)
print(params)
plt.plot(X1, Y1, "o")
plt.plot(X1, logN(X1, params[0], params[1]))
plt.show()

The fit returns parameters all equal to 1, and I get the following warning:

minpack.py:829: OptimizeWarning: Covariance of the parameters could not be estimated

[Figure: plot of the data together with the failed fit]

I'm wondering whether I'm calling the function properly or the syntax of my function is wrong. Any ideas?


Solution

  • Observations

    You are facing multiple challenges:

    • Stated as is, your problem is a nonlinear regression (in terms of the coefficients), which can be solved by a nonlinear algorithm such as Levenberg-Marquardt (implemented in scipy.optimize.curve_fit).
    • You are not including the A coefficient in the optimization process, yet it appears explicitly in your function (so it takes a global value that is not shown in your post). Moreover, A is related to sigma, since the former incorporates the latter (see the note after this list).
    • Some of your data do not suit a log-normal distribution (the point at x=1 looks suspect), and there is no estimate of the y uncertainty. This may prevent proper convergence of the parameter optimization, in which case the algorithm fails to compute the covariance matrix.
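
    Note, assuming the intended model is the normalized log-normal density: the standard log-normal PDF, in the parametrization used here, is

    f(x) = \frac{1}{x \,\ln\sigma \,\sqrt{2\pi}} \exp\left(-\frac{\ln^2(x/\mu)}{2 \ln^2\sigma}\right)

    so normalization fixes the prefactor to A = 1 / (ln(sigma) * sqrt(2*pi)); A and sigma cannot be chosen independently if the curve must integrate to 1.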

    Improvement proposal:

    • It is possible to rewrite your problem as a classical OLS fit of a second-order polynomial; then we do not have to rely on NLLS algorithms. Just apply the log-log transform to your relation to confirm it becomes linear in the coefficients, and to get the parameter conversion formulas (see the derivation after this list). Always prefer OLS over NLLS when it is available.

    • Remove or down-weight suspect points, ideally using an objective criterion.

    • Adapt your model function (not considered here).
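
    To see why the log-log transform works, take the logarithm of the model (a direct consequence of the function above):

    \ln y = \ln A - \ln x - \frac{(\ln x - \ln\mu)^2}{2 \ln^2\sigma}

    which, writing u = ln(x), is a quadratic ln(y) = c0 + c1*u + c2*u^2 whose coefficients identify as

    c_2 = -\frac{1}{2\ln^2\sigma}, \qquad c_1 = -1 + \frac{\ln\mu}{\ln^2\sigma}, \qquad c_0 = \ln A - \frac{\ln^2\mu}{2\ln^2\sigma}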

    MCVE

    Based on the data you provided:

    import io
    import numpy as np
    from scipy import optimize
    import pandas as pd
    import matplotlib.pyplot as plt
    
    data = io.StringIO("""id;x;y;sy
    0;1.0;0.121159;1
    1;101.0;0.195525;1
    2;201.0;0.167305;1
    3;301.0;0.125499;1
    4;401.0;0.094138;1
    5;501.0;0.071610;1
    6;601.0;0.053848;1
    7;701.0;0.039890;1
    8;801.0;0.031099;1
    9;901.0;0.023976;1
    10;1001.0;0.018271;1
    11;1101.0;0.013807;1
    12;1201.0;0.010596;1
    13;1301.0;0.008033;1
    14;1401.0;0.006710;1
    15;1501.0;0.005222;1
    16;1601.0;0.004299;1
    17;1701.0;0.003376;1
    18;1801.0;0.002976;1
    19;1901.0;0.002659;1
    """)
    df = pd.read_csv(data, sep=";", index_col="id")
    

    Rewriting your model function to:

    def func(x, A, mu, sigma):
        return (A/x)*np.exp(-((np.log(x/mu)/np.log(sigma))**2)/2)
    

    Note the modified signature: A is now an explicit parameter of the fit.

    Then we can naively fit the function by providing the data and a reasonable initial guess to the optimization algorithm:

    popt, pcov = optimize.curve_fit(func, df.x, df.y, sigma=df.sy,
                                    p0=(50, 100, 0.1), method="lm")
    
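    For reference, the fitted values and their 1-sigma uncertainties can be inspected with standard numpy/scipy usage (the square roots of the diagonal of pcov give the standard errors when the covariance is available):

    perr = np.sqrt(np.diag(pcov))
    for name, value, error in zip(("A", "mu", "sigma"), popt, perr):
        print(f"{name} = {value:.4g} +/- {error:.2g}")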

    But the result is not very satisfactory (unweighted):

    [Figure: data with the unweighted fit]

    And it is sensitive to the suspect points (here, after penalizing the point at x=1 with weight w=100):

    [Figure: fit with the x=1 point penalized (w=100)]
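
    For completeness, a sketch of how such a penalty can be applied: in curve_fit the sigma argument acts as an inverse weight, so raising sy for the suspect point down-weights it (the value 100 below matches the penalty mentioned above):

    df.loc[0, "sy"] = 100  # large sigma = low weight for the x=1 point
    popt, pcov = optimize.curve_fit(func, df.x, df.y, sigma=df.sy,
                                    p0=(50, 100, 0.1), method="lm")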

    Having uncertainty estimates for the y measurements could therefore help to adjust the fit.

    In any case, since the problem can be linearized, we should rely on that property; weights can be used in OLS as well.

    Linearization

    You can perform OLS with scipy.optimize.least_squares if you wish. I will use the sklearn framework, which is very handy:

    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    

    Let's remove the first suspect point:

    df = df.loc[1:,:]
    

    Then we adapt the inputs and perform the log-log transformation:

    X = np.log(df.x.values).reshape(-1, 1)
    y = np.log(df.y)
    

    We create the OLS pipeline for a second-order polynomial:

    poly = PolynomialFeatures(2)
    linreg = LinearRegression()
    model = make_pipeline(poly, linreg)
    

    And finally we fit the model to the data:

    model.fit(X, y)
    model.score(X, y) # 0.9982242621455882
    
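    To visualize the result in the original space, a minimal sketch (model.predict returns ln(y), so we exponentiate before plotting):

    y_fit = np.exp(model.predict(X))
    plt.plot(df.x, df.y, "o", label="data")
    plt.plot(df.x, y_fit, label="OLS fit (log-log)")
    plt.legend()
    plt.show()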

    It leads to:

    [Figure: data and the quadratic OLS fit]

    Which seems a reasonable adjustment for a quadratic. The last step is to convert the polynomial coefficients back into the desired quantities.
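
    A sketch of that back-conversion, using the coefficient identification derived above (linreg is the fitted step of the pipeline, since sklearn pipelines fit their steps in place; PolynomialFeatures(2) emits the columns [1, u, u**2], and sklearn stores the constant in intercept_, so coef_[0] is effectively zero):

    c0 = linreg.intercept_ + linreg.coef_[0]  # constant term
    c1 = linreg.coef_[1]                      # coefficient of u = ln(x)
    c2 = linreg.coef_[2]                      # coefficient of u**2

    s2 = -1.0 / (2.0 * c2)        # s2 = ln(sigma)**2, requires c2 < 0
    m = (c1 + 1.0) * s2           # m = ln(mu)
    sigma = np.exp(np.sqrt(s2))
    mu = np.exp(m)
    A = np.exp(c0 + m**2 / (2.0 * s2))
    print(A, mu, sigma)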