Tags: python, seaborn, jointplot

Add data to only the scatter plot part of an existing Seaborn jointplot


Is there a way to create a Seaborn Jointplot and then add additional data to the scatter plot part, but not the distributions?

The example below creates df and a jointplot from df. I would then like to add df2 to the scatter plot only, and still have its categories show in the legend.

import pandas as pd
import seaborn as sns

d = {
    'x1': [3,2,5,1,1,0],
    'y1': [1,1,2,3,0,2],
    'cat': ['a','a','a','b','b','b']
}

df = pd.DataFrame(d)


g = sns.jointplot(data=df, x='x1', y='y1', hue='cat')

d = {
    'x2': [2,0,6,0,4,1],
    'y2': [-3,-2,0,2,3,4],
    'cat': ['c','c','c','c','d','d']
}
df2 = pd.DataFrame(d)
## how can I add df2 to the scatter part of g 
## but not to the distribution
## and still have "c" and "d" in my scatter plot legend

Solution

    • Add points to the scatter plot with sns.scatterplot(..., ax=g.ax_joint), which automatically expands the legend to include the new categories of points.
    • Set a color palette before generating the joint plot, so that colors for the new categories can be taken from the same palette without repeating the ones already in use.
    import pandas as pd    # v 1.1.3
    import seaborn as sns  # v 0.11.0
    
    d = {
        'x1': [3,2,5,1,1,0],
        'y1': [1,1,2,3,0,2],
        'cat': ['a','a','a','b','b','b']
    }
    
    df = pd.DataFrame(d)
    
    # Set seaborn color palette
    sns.set_palette('bright')
    g = sns.jointplot(data=df, x='x1', y='y1', hue='cat')
    
    d = {
        'x2': [2,0,6,0,4,1],
        'y2': [-3,-2,0,2,3,4],
        'cat': ['c','c','c','c','d','d']
    }
    df2 = pd.DataFrame(d)
    
    # Extract new colors from currently selected color palette
    colors = sns.color_palette()[g.hue.nunique():][:df2['cat'].nunique()]
    
    # Plot additional points from second dataframe in scatter plot of the joint plot
    sns.scatterplot(data=df2, x='x2', y='y2', hue='cat', palette=colors, ax=g.ax_joint);
    
