Tags: seaborn, hue, catplot

Exclude one hue level from a seaborn catplot visualization


I want to visualize category counts with a seaborn catplot, but one of the hue levels is not important and does not need to appear in the plot. How can I select specific hue levels to show in a catplot without changing or removing any values in the original column?


Solution

  • You could remove the rows with that value from a copy of the dataframe, which leaves the original data untouched. If the column is categorical, you might also need to reset its categories; otherwise the legend will still list all of the original categories.

    Here is an example:

    import seaborn as sns
    import pandas as pd
    
    tips = sns.load_dataset('tips')
    tips['day'].dtype # CategoricalDtype(categories=['Thur', 'Fri', 'Sat', 'Sun'], ordered=False)
    # create a subset, a copy is needed to be able to change the categorical column
    tips_weekend = tips[tips['day'].isin(['Sat', 'Sun'])].copy()
    tips_weekend['day'].dtype # CategoricalDtype(categories=['Thur', 'Fri', 'Sat', 'Sun'], ordered=False)
    tips_weekend['day'] = pd.Categorical(tips_weekend['day'], ['Sat', 'Sun'])
    tips_weekend['day'].dtype # CategoricalDtype(categories=['Sat', 'Sun'], ordered=False)
    sns.catplot(data=tips_weekend, x='smoker', y='tip', hue='day')
    

    [Figure: catplot with reduced hue levels]
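
    As a possible shortcut for the category reset above, pandas can drop the unused categories itself via .cat.remove_unused_categories(). The sketch below wraps the two steps in a hypothetical helper, keep_hue_levels (not part of seaborn or pandas), and uses a small made-up dataframe in place of the tips dataset:

```python
import pandas as pd


def keep_hue_levels(df, column, levels):
    """Return a copy of df restricted to the given levels of a categorical column.

    Hypothetical helper for illustration; not a seaborn or pandas function.
    """
    # filter on a copy so the original dataframe stays unchanged
    subset = df[df[column].isin(levels)].copy()
    # drop categories that no longer occur, so they vanish from the legend
    subset[column] = subset[column].cat.remove_unused_categories()
    return subset


# small stand-in for the tips dataset (made-up values, for illustration only)
df = pd.DataFrame({
    'day': pd.Categorical(['Thur', 'Fri', 'Sat', 'Sun', 'Sat'],
                          categories=['Thur', 'Fri', 'Sat', 'Sun']),
    'tip': [1.0, 2.0, 3.0, 4.0, 5.0],
})
weekend = keep_hue_levels(df, 'day', ['Sat', 'Sun'])
print(list(weekend['day'].cat.categories))
```

    The returned dataframe can then be passed to sns.catplot(data=..., hue='day') in the same way as tips_weekend above.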

    For the follow-up question, a histplot with multiple='fill' can show the percentage distribution:

    import seaborn as sns
    import pandas as pd
    from matplotlib.ticker import PercentFormatter
    
    tips = sns.load_dataset('tips')
    tips_weekend = tips.copy()
    tips_weekend['day'] = tips_weekend['day'].apply(lambda x: x if x in ['Sat', 'Sun'] else 'other')
    # fix a new order
    tips_weekend['day'] = pd.Categorical(tips_weekend['day'], ['other', 'Sat', 'Sun'])
    
    ax = sns.histplot(data=tips_weekend, x='smoker', hue='day', stat='count', multiple='fill',
                      palette=['none', 'turquoise', 'crimson'])
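    # remove the first label ('other') in the legend
    # matplotlib 3.7 renamed legendHandles to legend_handles; support both
    legend = ax.legend_
    handles = legend.legend_handles if hasattr(legend, 'legend_handles') else legend.legendHandles
    ax.legend(handles=handles[1:], labels=['Sat', 'Sun'], title='day')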
    ax.yaxis.set_major_formatter(PercentFormatter(1))
    # add percentages
    for bar_group in ax.containers[:-1]:
        ax.bar_label(bar_group, label_type='center', labels=[f'{bar.get_height() * 100:.1f} %' for bar in bar_group])
    

    [Figure: seaborn showing percentage distribution]
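
    To check the numbers that the bar labels display, the same shares can be computed directly with pd.crosstab and normalize='index'. This is only a sketch on made-up stand-in data, not the tips dataset; the 'other' grouping mirrors the plot code above:

```python
import pandas as pd

# made-up stand-in data, for illustration only
df = pd.DataFrame({
    'smoker': ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'No', 'No', 'No'],
    'day':    ['Sat', 'Sun', 'Thur', 'Fri', 'Sat', 'Sat', 'Sun', 'Thur'],
})
# collapse every day except Sat and Sun into a single 'other' level
df['day'] = df['day'].where(df['day'].isin(['Sat', 'Sun']), 'other')

# normalize='index' divides each row by its total, giving per-smoker shares
shares = pd.crosstab(df['smoker'], df['day'], normalize='index')
print((shares * 100).round(1))
```

    Each row of shares sums to 1, which corresponds to the full bar height in the multiple='fill' plot.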