I know how to plot a histogram when individual datapoints are given, like (33, 45, 54, 33, 21, 29, 15, ...),
by simply using something like matplotlib.pyplot.hist(x, bins=10),
but what if I only have grouped data like:
I know that I can use bar plots to mimic a histogram by changing xticks
but what if I want to do this using only the hist function of matplotlib.pyplot?
Is it possible to do this?
You can build the hist() params manually and use the existing value counts as weights.
Say you have this df:
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame({'Marks': ['0-10', '10-20', '20-30', '30-40'], 'Number of students': [8, 12, 24, 26]})
>>> df
   Marks  Number of students
0   0-10                   8
1  10-20                  12
2  20-30                  24
3  30-40                  26
The bins are all the unique boundary values in Marks:
>>> bins = pd.unique(df.Marks.str.split('-', expand=True).astype(int).values.ravel())
array([ 0, 10, 20, 30, 40])
Choose one x value per bin. Any value that falls inside the bin works; the left edge is the easy choice because each bin includes its left edge:
>>> x = bins[:-1]
array([ 0, 10, 20, 30])
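To see why the left edge is a safe choice, here is a small check using numpy.histogram, which bins values the same way (each bin is half-open, [left, right), except the last one, which also includes its right edge). The midpoints variant and the variable names are only illustrations, not part of the original answer:

```python
import numpy as np

bins = np.array([0, 10, 20, 30, 40])
weights = np.array([8, 12, 24, 26])

left_edges = bins[:-1]                  # the choice used in this answer
midpoints = (bins[:-1] + bins[1:]) / 2  # alternative: the bin centers

# One x value per bin, each weighted by that bin's count
counts_left, _ = np.histogram(left_edges, bins=bins, weights=weights)
counts_mid, _ = np.histogram(midpoints, bins=bins, weights=weights)

print(counts_left)
print(counts_mid)
```

Both choices put exactly one x into each bin, so the weighted counts come back equal to the original group counts. Right edges would not work in general, since a value like 10 belongs to the 10-20 bin rather than the 0-10 bin.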
Use the existing value counts (Number of students) as weights:
>>> weights = df['Number of students'].values
array([ 8, 12, 24, 26])
Then plug these into hist():
>>> plt.hist(x=x, bins=bins, weights=weights)
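Putting the steps together, a minimal self-contained script could look like the sketch below. The Agg backend, the axis labels, the edgecolor styling and the output filename grouped_hist.png are my own assumptions for a script that runs without a display; in an interactive session you would drop the backend line and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Marks": ["0-10", "10-20", "20-30", "30-40"],
    "Number of students": [8, 12, 24, 26],
})

# Split "0-10" into 0 and 10, then keep the unique boundaries in order of appearance
edges = df["Marks"].str.split("-", expand=True).astype(int)
bins = pd.unique(edges.to_numpy().ravel())

x = bins[:-1]                                  # one x value per bin (its left edge)
weights = df["Number of students"].to_numpy()  # existing group counts

# hist() returns the bar heights, the bin edges and the drawn patches
counts, returned_bins, patches = plt.hist(x=x, bins=bins, weights=weights, edgecolor="black")

plt.xlabel("Marks")
plt.ylabel("Number of students")
plt.savefig("grouped_hist.png")  # hypothetical output filename
```

The counts returned by hist() should match the Number of students column, which is a quick way to confirm that each x landed in the intended bin.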