Tags: python, seaborn, visualization, distribution

How to have relative frequencies histograms in seaborn jointplot


I'd like to make a jointplot to compare the distributions of two conditions, but one of the conditions has far fewer cases, so its histograms are not visible on the x and y margins. I tried to get normalized histograms per condition with:

sns.jointplot(x=var_x, y=var_y, data=df, kind="kde", hue="condition", alpha=0.7, joint_kws={'norm_hist':True})

I also tried norm_hist={'norm_hist':True} and normalize instead of norm_hist, but neither worked. I've seen this post about distplot, but the argument norm_hist=True doesn't work for jointplot. I looked at the source code, but there is too much abstraction for me to see how I could tweak it to get normalized histograms.
Would you have any idea how to get that result?
Thanks!


Solution

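  • sns.jointplot(..., kind='kde') uses sns.kdeplot both for the central ("joint") and the marginal subplots. You can pass common_norm=False to either or both of them (via joint_kws and marginal_kws), so that each hue level is normalized independently instead of being scaled by its number of observations.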

    Also note that distplot is deprecated (since seaborn 0.11); seaborn's interface has been cleaned up and extended, and histplot, kdeplot and displot now cover what distplot did.

    Here is an example:

    import seaborn as sns
    
    # keep only the first 230 rows so that one species (Gentoo) has very few cases
    penguins = sns.load_dataset('penguins')[:230]
    
    sns.jointplot(data=penguins, x="bill_length_mm", y="bill_depth_mm", hue="species", kind="kde",
                  joint_kws={'common_norm': False}, marginal_kws={'common_norm': False})
    

    [Figure: sns.jointplot with common_norm=False]

    penguins.value_counts('species') shows the uneven counts:

    species
    Adelie       152
    Chinstrap     68
    Gentoo        10