python, tuples, stacked-chart

How to create a stacked bar chart from two lists, where one gives the cluster and the other a flag


Let's say I have the following piece of code:

x = [1,1,1,1,2,3,3,3]  
y = [1,1,0,0,1,1,1,0]

import matplotlib.pyplot as plt
from collections import Counter
freqs = Counter(x)
plt.bar(freqs.keys(), freqs.values(), width=0.5)
plt.xticks(list(freqs.keys()))

I want to come up with a stacked bar chart by coloring the bars of the chart above according to the y values, as shown below:

desired stacked bar chart image

How can I integrate y values into this bar chart?


Solution

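  • The most straightforward way to get the stacked bar chart is plain Matplotlib: count the 1s and 0s in each category, then draw the bars for 1 on top of the bars for 0 using the bottom argument.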

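    import matplotlib.pyplot as plt
    
    # Each tuple is (category, flag), i.e. the x and y lists zipped together
    dataset = [(1, 1), (1, 1), (1, 0), (1, 0), (2, 1), (3, 1), (3, 1), (3, 0)]
    categories = [1, 2, 3]
    
    # Number of observations with flag 1 and with flag 0 in each category
    y_1 = [len([1 for data in dataset if (data[0] == cat) and (data[1] == 1)]) for cat in categories]
    y_0 = [len([1 for data in dataset if (data[0] == cat) and (data[1] == 0)]) for cat in categories]
    
    labels = [str(cat) for cat in categories]
    # bottom=y_0 lifts the flag-1 bars so they sit on top of the flag-0 bars
    plt.bar(x=labels, height=y_1, label='1', bottom=y_0)
    plt.bar(x=labels, height=y_0, label='0')
    plt.legend()
    plt.xlabel('Category')
    plt.ylabel('Frequency')
    plt.show()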
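
The Matplotlib approach can also start directly from the two lists in the question instead of a hand-written list of tuples and a hard-coded category list. The following is a minimal sketch under that assumption; the helper names stacked_counts and plot_stacked are made up for illustration and are not part of Matplotlib.

```python
from collections import Counter

import matplotlib.pyplot as plt

x = [1, 1, 1, 1, 2, 3, 3, 3]  # cluster of each observation
y = [1, 1, 0, 0, 1, 1, 1, 0]  # flag of each observation


def stacked_counts(clusters, flags):
    """Return (categories, flag values, {flag: [count per category]})."""
    # Counter over the zipped lists maps (cluster, flag) -> frequency;
    # missing combinations such as (2, 0) count as 0.
    pair_counts = Counter(zip(clusters, flags))
    categories = sorted(set(clusters))
    flag_values = sorted(set(flags))
    counts = {
        flag: [pair_counts[(cat, flag)] for cat in categories]
        for flag in flag_values
    }
    return categories, flag_values, counts


def plot_stacked(clusters, flags):
    """Draw one bar segment per flag value, stacked within each category."""
    categories, flag_values, counts = stacked_counts(clusters, flags)
    labels = [str(cat) for cat in categories]
    fig, ax = plt.subplots()
    bottom = [0] * len(categories)  # running height of each stack
    for flag in flag_values:
        ax.bar(labels, counts[flag], width=0.5, bottom=bottom, label=str(flag))
        bottom = [b + c for b, c in zip(bottom, counts[flag])]
    ax.set_xlabel('Category')
    ax.set_ylabel('Frequency')
    ax.legend()
    return fig, ax


fig, ax = plot_stacked(x, y)
```

Call plt.show() afterwards to display the figure. Because the stack heights are accumulated in a loop, the same sketch should also cope with more than two distinct flag values.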
    


    Or if you are familiar with Pandas, you can use the built-in plotting function as well, which gives a similar plot:

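    import matplotlib.pyplot as plt
    import pandas as pd
    
    dataset = [(1, 1), (1, 1), (1, 0), (1, 0), (2, 1), (3, 1), (3, 1), (3, 0)]
    
    x = [tup[0] for tup in dataset]
    y = [tup[1] for tup in dataset]
    
    # freq_y is a dummy column; count() turns it into the frequency of each (x, y) pair
    df = pd.DataFrame({'x': x, 'y': y, 'freq_y': 0})
    
    # unstack(1) moves the y values into columns, giving one stacked segment per y value
    ax = df.groupby(['x', 'y']).count().unstack(1).plot(y='freq_y', kind='bar', stacked=True)
    ax.set_ylabel('Frequency')
    ax.set_xlabel('Category')
    plt.show()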
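
As a variation on the pandas approach, pd.crosstab builds the frequency table from the two lists in one step, without the dummy column. This is only a sketch: the names 'Category' and 'Flag' given to the two series are labels chosen here for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

x = [1, 1, 1, 1, 2, 3, 3, 3]  # cluster of each observation
y = [1, 1, 0, 0, 1, 1, 1, 0]  # flag of each observation

# Rows are clusters, columns are flag values, cells are frequencies.
# Combinations that never occur, such as (2, 0), are filled with 0.
table = pd.crosstab(pd.Series(x, name='Category'), pd.Series(y, name='Flag'))

# One stacked segment per column of the table
ax = table.plot(kind='bar', stacked=True, rot=0)
ax.set_xlabel('Category')
ax.set_ylabel('Frequency')
```

Call plt.show() afterwards to display the figure.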
    
