Tags: python, matplotlib, seaborn, bar-chart, facet-grid

Understanding FacetGrid/Barplot Inconsistencies


I was doing some EDA, and I observed the following behavior with Seaborn.

Seaborn version: 0.12.2

Matplotlib version: 3.7.1

Input data

import pandas as pd
import seaborn as sns

data = {'Class': [0, 1, 1, 1, 1, 0, 1, 0, 1],
        'count': [509, 61, 18, 29, 8, 148, 54, 361, 46],
        'greek_char': ['Alpha', 'Alpha', 'Alpha', 'Alpha', 'Beta', 'Beta', 'Beta', 'Beta', 'Beta'],
        'value': ['A', 'B', 'D', 'G', 'A', 'B', 'B', 'C', 'C']}

df = pd.DataFrame(data)

Code

fig = sns.FacetGrid(data=df, col="greek_char", hue="Class")

_ = fig.map_dataframe(sns.barplot, x="value", y="count", dodge=True)

I obtained the following graph:

Here are some inconsistencies:

  • Notice that Alpha doesn't have C in the dataset, but it appears in the graph.

  • Alpha A has only Class 0; however, both classes appear for it in the graph.

  • Values D and G are missing from the graph.

I would appreciate any help in determining whether this behavior is a bug, expected behavior, or if I am missing something.


Solution

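  • If you run your code with fig.map instead of fig.map_dataframe, you get the warning "UserWarning: Using the barplot function without specifying 'order' is likely to produce an incorrect plot." FacetGrid calls barplot once per facet and hue subset, and without order each call infers its categories only from the rows it receives, so the same x position can stand for different values in different calls. Once I add the order argument, I get the correct plot.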

    import pandas as pd
    import seaborn as sns
    
    data = {"Class":[0, 1, 1, 1, 1, 0, 1, 0, 1],
            "count":[509, 61, 18, 29, 8, 148, 54, 361, 46],
            "greek_char":["Alpha"]*4 + ["Beta"]*5,
            "value":["A", "B", "D", "G", "A", "B", "B", "C", "C"]}
    
    df = pd.DataFrame(data)
    
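    fig = sns.FacetGrid(data=df, col="greek_char", hue="Class")
    # Pass one shared category order so every facet/hue subset
    # places each value at the same x position.
    fig = fig.map_dataframe(sns.barplot,
                            x="value",
                            y="count",
                            order=sorted(df["value"].unique()))
    fig.add_legend()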
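
To see why a shared order matters, here is a plain-Python sketch with no plotting involved. It is a simplified model, not seaborn's actual code: it assumes each barplot call takes its categories from the order of first appearance within its own (greek_char, Class) subset. The helper name inferred_positions is hypothetical and exists only for this illustration.

```python
# Simplified model of per-subset category inference (NOT seaborn's real code).
# Each tuple is (greek_char, Class, value), copied from the question's data.
rows = [
    ("Alpha", 0, "A"), ("Alpha", 1, "B"), ("Alpha", 1, "D"), ("Alpha", 1, "G"),
    ("Beta", 1, "A"), ("Beta", 0, "B"), ("Beta", 1, "B"), ("Beta", 0, "C"),
    ("Beta", 1, "C"),
]


def inferred_positions(rows, order=None):
    """Map each (facet, hue) subset to {value: x position}.

    Hypothetical helper: without `order`, positions come from the order of
    first appearance inside the subset; with `order`, from the shared list.
    """
    subsets = {}
    for facet, hue, value in rows:
        seen = subsets.setdefault((facet, hue), [])
        if value not in seen:
            seen.append(value)

    positions = {}
    for key, seen in subsets.items():
        categories = order if order is not None else seen
        positions[key] = {v: categories.index(v) for v in seen}
    return positions


# Without a shared order, every subset starts counting from 0 on its own.
without_order = inferred_positions(rows)

# With a shared order, a value always lands on the same position.
shared_order = sorted({value for _, _, value in rows})
with_order = inferred_positions(rows, order=shared_order)

for key in without_order:
    print(key, without_order[key], "->", with_order[key])
```

Under this model, position 0 means "A" for Alpha/Class 0 but "B" for Alpha/Class 1, which would explain why Alpha A appears to have both classes and why D and G never show up under their own labels. With the shared order, B, D and G move to positions 1, 3 and 4 in every subset.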