How do I plot a percentile graph with interval data?
The code below calculates percentiles of data based on specific intervals.
import pandas as pd
import matplotlib.pyplot as plt
idx = pd.IntervalIndex.from_breaks([39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9])
df = pd.DataFrame({"Bin": idx, "Frequency": [2, 2, 5, 5, 12, 3]})
n = df["Frequency"].sum()
df['cumulativeSumFreq'] = df["Frequency"].cumsum()
df['cumulativePercent'] = (df["Frequency"]/n)*100
df
bins = [39.9, 42.9,45.9,48.9,51.9,54.9,57.9]
df.hist(column='cumulativePercent', bins=bins)
plt.show()
For some reason, df.hist() does not accept bins=idx.
The plot I get does not follow the intended binning. How can I plot these values against the interval edges?
Use pyplot.stairs or Axes.stairs:
edges = [39.9, 42.9,45.9,48.9,51.9,54.9,57.9]
plt.stairs(df['cumulativePercent'], edges, fill=True)
or
edges = [39.9, 42.9,45.9,48.9,51.9,54.9,57.9]
fig, ax = plt.subplots()
ax.stairs(df['cumulativePercent'], edges, fill=True)
Output: [image: filled step plot of cumulativePercent over the bin edges]
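For context, df.hist() builds a histogram of the values stored in the column, so it counts how many percentages fall into each bin instead of using them as bar heights; that is why the binning looks wrong. The sketch below is one possible self-contained variant of the stairs approach, not the only way to do it. It derives the edges from the IntervalIndex rather than retyping them, and it assumes that a running (cumulative) percentage is what the "percentile graph" is meant to show, since the cumulativePercent column in the question actually holds per-bin percentages. The names percent and cumulative_percent are introduced here for illustration only.

```python
import pandas as pd
import matplotlib.pyplot as plt

idx = pd.IntervalIndex.from_breaks([39.9, 42.9, 45.9, 48.9, 51.9, 54.9, 57.9])
df = pd.DataFrame({"Bin": idx, "Frequency": [2, 2, 5, 5, 12, 3]})

# Recover the bin edges from the intervals:
# every left edge plus the right edge of the last interval.
edges = idx.left.tolist() + [idx.right[-1]]

# Per-bin percentage of the total frequency.
percent = df["Frequency"] / df["Frequency"].sum() * 100

# Assumption: the intended curve is the running total of those percentages.
cumulative_percent = percent.cumsum()

# stairs() needs one value per bin, i.e. len(edges) == len(values) + 1.
fig, ax = plt.subplots()
ax.stairs(cumulative_percent.to_numpy(), edges, fill=True)
ax.set_xlabel("Bin edges")
ax.set_ylabel("Cumulative percent")
plt.show()
```

To plot the per-bin percentages instead, pass percent.to_numpy() to ax.stairs in place of the cumulative values. Note that stairs requires Matplotlib 3.4 or newer.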