
Counts, bars, bins for each pandas DataFrame histogram subplot


I am making separate histograms of travel distance per departure hour. However, for further calculations I'd like to have the value of each bin, for every histogram.

Up until now, I have the following:

    df['Distance'].hist(by=df['Departuretime'], color='red', edgecolor='black',
                        figsize=(15, 15), sharex=True, density=True)

In my case, this creates a figure with 21 small histograms.

With a single histogram, I'd put counts, bins, bars = in front of the entire line, and the variable counts would contain the data I was looking for. In this case, however, that does not work.

Ideally I'd like a dataframe or list of some sort for each histogram, containing the density values of the bins. I hope someone can help me out! Thanks in advance!

Edit:

Data I'm using (about 2500 rows like this); Distance is float64 and Departuretime is str

Histogram output I'm receiving

For all of these histograms, I want to know the y-axis value of each bar, preferably in a dataframe with the distance bins as rows and the hours as columns.


Solution

  • By using the pandas 'cut' function, you can extract the requested data directly from your dataframe instead of from the graph. This is less error-prone.

    df['DistanceBin'] = pd.cut(df['Distance'], bins=10)
    

    Then, you can use pivot_table to obtain a table of counts, with DistanceBin as rows and Departuretime as columns, as you asked. Passing values='Distance' makes the count apply to that column only, so the result has one column per departure hour.

    df.pivot_table(index='DistanceBin', columns='Departuretime', values='Distance', aggfunc='count')
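
The table above holds counts, while the question plots with density=True. The following is a minimal sketch of one way to turn the counts into density values, assuming every hour should share the same bin edges so that the rows line up. The sample data is a hypothetical stand-in for the asker's dataframe, and the explicit edges, the groupby-based count, and the names edges, counts, widths and density are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: float distances, hour stored as str.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Distance": rng.gamma(shape=2.0, scale=5.0, size=2500),
    "Departuretime": rng.integers(5, 9, size=2500).astype(str),
})

# Shared bin edges for every hour, so the rows of the result are comparable.
edges = np.linspace(df["Distance"].min(), df["Distance"].max(), 11)
df["DistanceBin"] = pd.cut(df["Distance"], bins=edges, include_lowest=True)

# Counts per (bin, hour); observed=False keeps bins that happen to be empty.
counts = (
    df.groupby(["DistanceBin", "Departuretime"], observed=False)
      .size()
      .unstack(fill_value=0)
)

# Density = count / (total count in that hour * bin width),
# so each column integrates to 1 over the distance axis.
widths = np.diff(edges)
density = counts.div(counts.sum(axis=0), axis=1).div(widths, axis=0)

print(density.round(4))
```

Because the edges here are shared across all hours, the values may differ from those in a figure where each subplot picks its own bins. Passing the same edges to the plotting call through its bins argument should make the two agree.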