Search code examples
pandasseabornsubplotfacet-gridjointplot

Joint plot for groupby datas on seaborn


I have a dataframe that looks like this:

In[1]: df.head()
Out[1]:
dataset  x     y
   1     56   45
   1     31   67
   7     22   85
   2     90   45
   2     15   42

There are about 4000 more rows. x and y is grouped by the datasets. I am trying to plot a jointplot for each dataset separately using seaborn. This is what I can come up so far:

import seaborn as sns

g = sns.FacetGrid(df, col="dataset", col_wrap=3)
g.map_dataframe(sns.scatterplot, x="x", y="y", color = "#7db4a2")
g.map_dataframe(sns.histplot, x="x", color = "#7db4a2")
g.map_dataframe(sns.histplot, y="y", color = "#7db4a2")
g.add_legend();

but there are all overlapped. How do I make a proper jointplot for each dataset in a subplot? Thank you in advanced and cheers!


Solution

  • You can use groupby on your dataset column, then use sns.jointgrid(), and then finally add your scatter plot and KDE plot to the jointgrid.

    Here is an example using a random seed generator with numpy. I made three "datasets" and random x,y values. See the Seaborn jointgrid documentation for ways to customize colors, etc.

    ### Build an example dataset
    np.random.seed(seed=1)
    ds = (np.arange(3)).tolist()*10
    x = np.random.randint(100, size=(60)).tolist()
    y = np.random.randint(20, size=(60)).tolist()
    df = pd.DataFrame(data=zip(ds, x, y), columns=["ds", "x", "y"])
    
    ### The plots
    for _ds, group in df.groupby('ds'):
        group = group.copy()
        g = sns.JointGrid(data=group, x='x', y='y')
        g.plot(sns.scatterplot, sns.kdeplot)
    

    enter image description here