python pandas matplotlib histogram binning

Plotting already binned histogram data stored in a pandas DataFrame


I have a .csv file containing histogram data that is already binned and normalised, which I read into a pandas DataFrame df:

Freq
0.4
0.0
0.0
0.0
0.01
0.05
0.1
0.04
0.05
0.05
0.02
0.08
0.10
0.03
0.07

I would like to plot this as a cumulative distribution histogram using matplotlib, but pyplot.hist treats the values as raw samples and bins them again, which is not what I want.

plt.hist(df['Freq'], cumulative=True)

Can anyone tell me how to do this?


Solution

  • Since the data is already binned, you can skip plt.hist and plot the cumulative sum directly as a step line:

    df['Freq'].cumsum().plot(drawstyle='steps')
    


    And to fill under the curve:

    cdf = df['Freq'].cumsum()  # compute the cumulative sum once and reuse it
    ax = cdf.plot(drawstyle='steps')
    ax.fill_between(df.index, 0, cdf, step="pre")
    

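For reference, here is a minimal self-contained sketch of the approach above, using the Freq values from the question. Some things in it are assumptions for illustration and not from the original post: the CSV is inlined through io.StringIO instead of being read from a file, the Agg backend is selected so the script runs without a display, and the output file name cdf.png is hypothetical. The second panel shows one possible alternative: passing the pre-binned frequencies to hist through its weights argument with one bin per row, so that hist does not re-bin the values.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Inline copy of the question's data (assumption: the real data comes from a .csv file)
csv_text = """Freq
0.4
0.0
0.0
0.0
0.01
0.05
0.1
0.04
0.05
0.05
0.02
0.08
0.10
0.03
0.07
"""
df = pd.read_csv(io.StringIO(csv_text))

# The data is already binned, so the cumulative distribution is just a running sum
cdf = df["Freq"].cumsum()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Approach from the answer: step line with the area underneath filled
cdf.plot(ax=ax1, drawstyle="steps")
ax1.fill_between(df.index, 0, cdf, step="pre")
ax1.set_title("cumsum + step plot")

# Alternative sketch: one bin per row, with the stored frequencies as weights
edges = np.arange(len(df) + 1) - 0.5  # bin edges centred on the row indices
n, bins, patches = ax2.hist(
    df.index, bins=edges, weights=df["Freq"], cumulative=True
)
ax2.set_title("hist with weights")

fig.savefig("cdf.png")  # hypothetical output file name
```

Both panels should show the same cumulative curve. The weights variant may be convenient if you want bar-style patches or other hist options, while the cumsum variant keeps the plotting logic explicit.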