python pandas matplotlib histogram bins

How to arrange bins in stacked histogram, Python


I am working on code for a stacked histogram and need help arranging the bins in the following order, if this is possible:

0.01 - 0.1, 0.1 - 0.5, 0.5 - 1.0, 1.0 - 2.5, > 2.5

Right now, my histogram looks like this:

[Image: histogram]

with the order of bins being:

0.01 - 0.1, 1.0 - 2.5, > 2.5, 0.1 - 0.5, 0.5 - 1.0

Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = [['0.01 - 0.1','A'],['0.1 - 0.5','B'],['0.5 - 1.0','B'],['0.01 - 0.1','C'],['> 2.5','A'],['1.0 - 2.5','A'],['> 2.5','A']]

df = pd.DataFrame(data, columns = ['Size','Index'])

### HISTOGRAM OF SIZE

df_new = df.sort_values(['Size'])

x_var = 'Size'
groupby_var = 'Index'
df_new_agg = df_new.loc[:, [x_var, groupby_var]].groupby(groupby_var)
vals = [group[x_var].values.tolist() for _, group in df_new_agg]

list_of_colors_element = ['lightcoral','palegreen','forestgreen']

# Draw
plt.figure(figsize=(16, 10), dpi=80)
n, bins, patches = plt.hist(vals, df_new[x_var].nunique(), stacked=True, density=False, color=list_of_colors_element)

# Decorations
plt.legend({group:col for group, col in zip(np.unique(df_new[groupby_var]).tolist(), list_of_colors_element)}, prop={'size': 16})
plt.title("Stacked Histogram of Size colored by element of highest share", fontsize=22)
plt.xlabel(x_var, fontsize=22)
plt.ylabel("Frequency", fontsize=22)
plt.grid(color='black', linestyle='--', linewidth=0.4)
plt.xticks(range(5),fontsize=15)
plt.yticks(fontsize=15)

plt.show()

Any help is appreciated!


Solution

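  • plt.hist places string values on the axis in the order it first encounters them, group by group, so sorting the DataFrame beforehand does not fix the bin order. Instead, count the rows per Size and Index with a pivot table and draw a stacked bar chart. pivot_table sorts its index, and for these labels the sorted order is the one you want: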

    piv = df_new.assign(dummy=1) \
                .pivot_table('dummy', 'Size', 'Index', aggfunc='count', fill_value=0) \
                .rename_axis(columns=None)
    ax = piv.plot.bar(stacked=True, color=list_of_colors_element, rot=0, width=1)
    plt.show()
    

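If the labels did not happen to sort alphabetically into the desired order, one option is to state the order explicitly. The following is a minimal sketch, assuming the same sample data as in the question. The names `size_order` and `count_by_size` are illustrative and not part of the original answer.

```python
import pandas as pd

# Same sample data as in the question.
data = [['0.01 - 0.1', 'A'], ['0.1 - 0.5', 'B'], ['0.5 - 1.0', 'B'],
        ['0.01 - 0.1', 'C'], ['> 2.5', 'A'], ['1.0 - 2.5', 'A'], ['> 2.5', 'A']]
df = pd.DataFrame(data, columns=['Size', 'Index'])

# Desired left-to-right order of the bins (illustrative name).
size_order = ['0.01 - 0.1', '0.1 - 0.5', '0.5 - 1.0', '1.0 - 2.5', '> 2.5']


def count_by_size(frame, order):
    """Count rows per (Size, Index) pair, with rows in the given order."""
    counts = pd.crosstab(frame['Size'], frame['Index'])
    # reindex fixes the row order and fills bins that have no rows with 0.
    return counts.reindex(order, fill_value=0).rename_axis(columns=None)


counts = count_by_size(df, size_order)
print(counts)
```

Calling `counts.plot.bar(stacked=True, rot=0, width=1)` then draws the stacked bars in that order, in the same way as the pivot table above.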