
Plot multiple distplots in a seaborn FacetGrid


I have a dataframe which looks like below:

df:

RY         MAJ_CAT                  Value
2016    Cause Unknown              0.00227
2016    Vegetation                 0.04217
2016    Vegetation                 0.04393
2016    Vegetation                 0.07878
2016    Defective Equip            0.00137
2018    Cause Unknown              0.00484
2018    Defective Equip            0.01546
2020    Defective Equip            0.05169
2020    Defective Equip            0.00515
2020    Cause Unknown              0.00050

I want to plot the distribution of Value for each of the given years, so I used seaborn's distplot with the following code:

year_2016 = df[df['RY'] == 2016]
year_2018 = df[df['RY'] == 2018]
year_2020 = df[df['RY'] == 2020]
sns.distplot(year_2016['Value'].values, hist=False, rug=True)
sns.distplot(year_2018['Value'].values, hist=False, rug=True)
sns.distplot(year_2020['Value'].values, hist=False, rug=True)

Next, I want to plot the same per-year distributions split by MAJ_CAT. I decided to use seaborn's FacetGrid; the code is below:

g = sns.FacetGrid(df, col='MAJ_CAT')
g = g.map(sns.distplot, df[df['RY'] == 2016]['Value'].values, hist=False, rug=True)
g = g.map(sns.distplot, df[df['RY'] == 2018]['Value'].values, hist=False, rug=True)
g = g.map(sns.distplot, df[df['RY'] == 2020]['Value'].values, hist=False, rug=True)

However, when I run the code above, it throws the following error:

 KeyError: "None of [Index([(0.00227, 0.04217, 0.043930000000000004, 0.07877999999999999, 0.00137, 0.0018800000000000002, 0.00202, 0.00627, 0.00101, 0.07167000000000001, 0.01965, 0.02775, 0.00298, 0.00337, 0.00088, 0.04049, 0.01957, 0.01012, 0.12065, 0.23699, 0.03639, 0.00137, 0.03244, 0.00441, 0.06748, 0.00035, 0.0066099999999999996, 0.00302, 0.015619999999999998, 0.01571, 0.0018399999999999998, 0.03425, 0.08046, 0.01695, 0.02416, 0.08975, 0.0018800000000000002, 0.14743, 0.06366000000000001, 0.04378, 0.043, 0.02997, 0.0001, 0.22799, 0.00611, 0.13960999999999998, 0.38871, 0.018430000000000002, 0.053239999999999996, 0.06702999999999999, 0.14103, 0.022719999999999997, 0.011890000000000001, 0.00186, 0.00049, 0.13947, 0.0067, 0.00503, 0.00242, 0.00137, 0.00266, 0.38638, 0.24068, 0.0165, 0.54847, 1.02545, 0.01889, 0.32750999999999997, 0.22526, 0.24516, 0.12791, 0.00063, 0.0005200000000000001, 0.00921, 0.07665, 0.00116, 0.01042, 0.27046, 0.03501, 0.03159, 0.46748999999999996, 0.022090000000000002, 2.2972799999999998, 0.69021, 0.22529000000000002, 0.00147, 0.1102, 0.03234, 0.05799, 0.11744, 0.00896, 0.09556, 0.03202, 0.01347, 0.00923, 0.0034200000000000003, 0.041530000000000004, 0.04848, 0.00062, 0.0031100000000000004, ...)], dtype='object')] are in the [columns]"

I am not sure where I am making the mistake. Could anyone please help me fix the issue?


Solution

  • Set up the dataframe

    import pandas as pd
    import numpy as np
    import seaborn as sns
    
    # setup dataframe of synthetic data
    np.random.seed(365)
    data = {'RY': np.random.choice([2016, 2018, 2020], size=400),
            'MAJ_CAT': np.random.choice(['Cause Unknown', 'Vegetation', 'Defective Equip'], size=400),
            'Value': np.random.random(size=400) }
    
    df = pd.DataFrame(data)
    

    Updated Answer

    • From seaborn v0.11, use sns.displot with kind='kde' and rug=True.
      • displot is a figure-level interface for drawing distribution plots onto a FacetGrid.

    Plotting all 'MAJ_CAT' together

    sns.displot(data=df, x='Value', hue='RY', kind='kde', palette='tab10', rug=True)
    


    Plotting 'MAJ_CAT' separately

    sns.displot(data=df, col='MAJ_CAT', x='Value', hue='RY', kind='kde', palette='tab10', rug=True)
    


    Original Answer

    • distplot is deprecated as of seaborn v0.11; on newer versions prefer displot (see the updated answer above).

    distplot

    • Consolidate the original code into a loop over the years to generate the distplot:
    for year in df.RY.unique():
        values = df.Value[df.RY == year]
        sns.distplot(values, hist=False, rug=True)
    


    facetgrid

    • Pass the column name ('Value'), not an array of values, to map, and add hue to FacetGrid:
    g = sns.FacetGrid(df, col='MAJ_CAT', hue='RY')
    p1 = g.map(sns.distplot, 'Value', hist=False, rug=True).add_legend()
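
The KeyError in the question appears to come from handing g.map an array of values instead of a column name. As far as I can tell, FacetGrid.map treats each positional argument as a column label to look up in each facet's data. The pandas-only sketch below mimics that kind of lookup on a tiny hypothetical frame. The names small and lookup are illustrative, not seaborn internals, and the values are passed one by one rather than as a single array, for simplicity.

```python
import pandas as pd

# tiny hypothetical frame with the same columns as the question's df
small = pd.DataFrame({
    'RY': [2016, 2016, 2018],
    'MAJ_CAT': ['Cause Unknown', 'Vegetation', 'Cause Unknown'],
    'Value': [0.00227, 0.04217, 0.00484],
})

def lookup(frame, *args):
    """Select columns by label, the way a name-based mapping would."""
    return frame[list(args)]

# a column name is a valid label, so the lookup succeeds
ok = lookup(small, 'Value')

# data values are not column labels, so pandas raises KeyError
try:
    lookup(small, 0.00227, 0.04217)
    raised = False
except KeyError as err:
    raised = True
    message = str(err)  # reports that none of the values are in the columns
```

This is why the fix passes the string 'Value' and lets hue='RY' handle the per-year split, instead of filtering the dataframe by year and passing the resulting arrays.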
    
