I would like to fit a line that uses the inverse of the error bars as weights.
I binned my data x and y, into 10 bins ( ~26 points each ) and took their mean. This is not a matrix of values so polyfit isn't happy with that.
# note I didn't test this pseudo code....
import numpy as np
import scipy
x = np.random.randn(100)
y = np.random.rand(100)
x_bins = np.linspace(x.min(), x.max(), 10) # reduction of data into 10 bins
y_bins = np.linspace(y.min(), y.max(), 10)
x_bin = np.digitize(x, x_bins)
y_bin = np.digitize(y, y_bins)
x_mu = np.zeros(10)
y_mu = x_mu.copy()
err = x_mu.copy()
for i in range(10):
x_mu = np.mean(x[x_bin==i])
y_mu = np.mean(y[y_bin==i])
err = np.std([y[y_bin==i])
x_mu[np.isnan(x_mu)] = 0
y_mu[np.isnan(y_mu)] = 0
errror[np.isnan(error)] = 0
plt.errorbar(x_mu, y_mu, err, fmt='o')
EDIT: scipy.polyfit stopped complaining about ill conditioned inputs...
out = scipy.polyfit(x_mu, y_mu, deg=1, w=error)
A numpy.polyfit
does not allow you to explicitly specify uncertainties. Instead you could use scipy.optimize.curve_fit
, e.g.
import numpy as np
import scipy
import scipy.optimize
x = np.linspace(0,1, 100)
y = np.random.rand(100)
# bin the data
n, bins = np.histogram(y, 10, [0, 1])
xb = bins[:-1] + 0.05 # at bin center; has overflow bin
yb = n # just the per-bin counts
err = sqrt(n) # from Poisson statistics
plt.errorbar(xb, yb, err, fmt='ro')
# fit a polynomial of degree 1, no explicit uncertainty
a1, b1 = np.polyfit(xb, yb, 1)
plt.plot(xb, a1*xb + b1, 'b')
# fit explicitly taking uncertainty into account
f = lambda x, a, b: a*x + b # function to fit
# fit with initial guess for parameters [1, 1]
pars, corr = scipy.optimize.curve_fit(f, xb, yb, [1, 1], err)
a2, b2 = pars
plt.plot(xb, a2*xb + b2, 'r')
To properly interpret the fit you need to examine the correlation matrix of the fitted parameters, but that goes beyond this technical question.