I have a histogram from measured data and I want to find an envelope (a continuous function) of this histogram. What do you suggest? How can I do this in Python?
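import matplotlib.pyplot as plt

def plot_histogram_of_real_data(file_name='/home/me/data.txt'):
    plt.figure('Histogram of real data')
    data = load_measured_data(file_name)  # my own loader, not shown here
    # n: counts per bin, bins: the 31 bin edges, patches: the drawn bars
    n, bins, patches = plt.hist(data, 30, facecolor='green', alpha=0.75)
    plt.grid()
    plt.show()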
You can fit a function to the bin values you get from the histogram, in one of several ways:
numpy.polyfit for polynomial fits (https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html)
scipy.optimize.curve_fit for fitting arbitrary functions
There is also kernel density estimation, scipy.stats.gaussian_kde, which is a standard representation for most statisticians.
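As a rough sketch of the fitting route: take the bin counts and bin centers from numpy.histogram (same binning as plt.hist with 30 bins) and fit to those points. The Gaussian model, the synthetic data, and the polynomial degree below are assumptions for illustration only; swap in your own data and whatever model shape suits it.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amplitude, mean, sigma):
    """Hypothetical model function; replace with a shape that suits your data."""
    return amplitude * np.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Synthetic stand-in for the measured data (assumption, illustration only).
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=5000)

# Counts per bin and the 31 bin edges, as plt.hist(data, 30) would compute.
counts, edges = np.histogram(data, bins=30)
centers = 0.5 * (edges[:-1] + edges[1:])

# Fit the model to (bin center, count) pairs; p0 is a rough initial guess.
p0 = [counts.max(), data.mean(), data.std()]
params, covariance = curve_fit(gaussian, centers, counts, p0=p0)

# Polynomial alternative: least-squares fit of degree 4 to the same points.
poly_coeffs = np.polyfit(centers, counts, deg=4)

# Evaluate both candidate envelopes on a fine grid for plotting.
x = np.linspace(edges[0], edges[-1], 400)
envelope_fit = gaussian(x, *params)
envelope_poly = np.polyval(poly_coeffs, x)
```

Either curve can then be drawn over the histogram with plt.plot(x, envelope_fit). A polynomial often behaves badly near the edges of the data range, so it is worth checking the result visually.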
In seaborn, you can plot sns.kdeplot for a single set of data, and sns.violinplot for multiple sets of data. For data which may vary significantly, I would suggest using kernel density estimates rather than fitting some function of your own to the histogram.
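A minimal sketch of the kernel density route with scipy.stats.gaussian_kde, assuming the data is a 1-D array. The two-peak synthetic data is only a placeholder for your measurements, and the bandwidth is left at the library default.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic two-peak stand-in for the measured data (assumption, illustration only).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 3000), rng.normal(4.0, 0.5, 1000)])

# Build the estimator; the default bandwidth is chosen by Scott's rule.
kde = gaussian_kde(data)

# Evaluate the estimated probability density on a fine grid.
x = np.linspace(data.min(), data.max(), 400)
density = kde(x)

# To overlay on a histogram of raw counts, rescale the density by
# (number of samples) * (bin width), using the same 30 bins as the histogram.
counts, edges = np.histogram(data, bins=30)
bin_width = edges[1] - edges[0]
envelope = density * data.size * bin_width
```

Plot envelope over plt.hist(data, 30), or plot density directly over plt.hist(data, 30, density=True). No model shape has to be chosen up front, which is why this tends to cope better with irregular data than a hand-picked function.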