python pandas seaborn visualization legend

How to display only the categories present in the data in the legend


I have a data frame as below:

[image of the data frame not included]

In the above dataframe, 'Month' is an ordered Categorical column defined as:

cats = ['January', 'February', 'March', 'April','May','June', 'July', 'August','September', 'October', 'November', 'December']
month_gr['Month'] = pd.Categorical(month_gr['Month'], cats, ordered = True)

Using Seaborn barplot:

ax = sns.barplot(data = month_gr, x = 'Item Name', y = 'Total', hue = 'Month')
ax.set_xticklabels(ax.get_xticklabels(), rotation= 90, ha = 'right')

Output: [bar plot image not included]

The legend above displays all 12 months of the Categorical column. I want to display the legend for only 4 months ['June', 'July', 'August', 'September'], as my data contains only these 4 months. Is there a way to dynamically control the legend so it displays only the available Categories passed to data?


Solution

  • You could create a list of "used months" and then set that list as hue_order. This also ensures that only those months will take up space for the bars.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    month_col = ['June'] * 5 + ['July'] * 5 + ['August'] * 5 + ['September'] * 7
    month_gr = pd.DataFrame({'Month': month_col,
                             'Item Name': [*'abcdebdefgbcefgabcdefg'],
                             'Total': np.random.randint(100, 1000, len(month_col))})
    
    cats = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
    month_gr['Month'] = pd.Categorical(month_gr['Month'], cats, ordered=True)
    
    # Keep only the months that occur in the data, in calendar order
    present = set(month_gr['Month'])
    used_months = [m for m in cats if m in present]
    
    ax = sns.barplot(data=month_gr, x='Item Name', y='Total',
                     hue='Month', hue_order=used_months, palette=sns.color_palette("Set2"))
    plt.show()
    

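As an alternative to building the list by hand, pandas can drop the unused categories itself. The sketch below is a minimal illustration, not part of the original answer: it uses a small made-up frame and only pandas, and relies on `Series.cat.remove_unused_categories()`, which keeps the remaining categories in their original order.

```python
import pandas as pd

cats = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
        'August', 'September', 'October', 'November', 'December']

# Small illustrative frame (made-up values), mirroring the example above
month_gr = pd.DataFrame({
    'Month': ['June', 'July', 'August', 'September', 'June'],
    'Item Name': ['a', 'b', 'c', 'd', 'e'],
    'Total': [120, 340, 560, 780, 910],
})
month_gr['Month'] = pd.Categorical(month_gr['Month'], categories=cats, ordered=True)

# Drop categories that never occur; the rest keep their calendar order
trimmed = month_gr['Month'].cat.remove_unused_categories()
used_months = trimmed.cat.categories.tolist()
print(used_months)  # ['June', 'July', 'August', 'September']
```

The resulting `used_months` can be passed as `hue_order=used_months` to `sns.barplot`, exactly as in the answer above. Assigning `trimmed` back to `month_gr['Month']` before plotting should have a similar effect, since the column then carries only the four categories.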