python matplotlib seaborn displot

Curve the Kernel Density Estimate (KDE) in seaborn displot


When I try to plot my data as a histogram using seaborn displot:

plot = sns.displot(
    data=z, kde=True, kind="hist", bins=3000, legend=True, aspect=1.8
).set(title='Error Distribution')

The KDE is drawn as straight line segments instead of a smooth curve, as in this plot: [image: Error Distribution]. Is there a way to make the KDE curve follow all the bins of the histogram smoothly?


Solution

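  • Straight segments like these typically appear when the data contains a far outlier and the plot is then zoomed in: the KDE is evaluated at a fixed number of points spread over the full data range, so only a few of them land in the visible region. Instead of zooming in, you can restrict the histogram to a certain range (via binrange=...). To limit the range of the KDE as well, use the clip keyword (passed via kde_kws). Here is an example, first without setting the range: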

    from matplotlib import pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    # first, create some test data
    slatm = np.random.normal(-.9, .4, size=(10000, 10)).max(axis=1)
    split = np.random.normal(-.1, .1, size=(10000, 10)).max(axis=1)
    split[0] = 200  # add an extreme outlier to the dataset
    z = pd.DataFrame({'slatm': slatm, 'split': split})
    
    g = sns.displot(data=z, kde=True, kind="hist", bins=3000, legend=True, aspect=1.8)
    g.set(title='Error Distribution')
    g.ax.set_xlim(-1, 0.5) # zoom in via the x limits
    

    displot with zooming in

    Here is how it looks when the ranges of both the histogram and the KDE are limited:

    min_x, max_x = -1, 0.5
    g = sns.displot(data=z, kde=True, kind="hist", bins=30, binrange=(min_x, max_x), legend=True, aspect=1.8,
                    kde_kws={'clip': (min_x, max_x)})
    g.set(title='Error Distribution')
    

    sns.displot with limiting the ranges
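
To see roughly why clipping helps, here is a small numeric sketch. It does not call seaborn; it only models the KDE evaluation grid, assuming the curve is computed on an evenly spaced grid of 200 points (the documented default of `gridsize`) that spans the data range, or the `clip` range when one is given. The helper `points_in_view` is a made-up name for this illustration.

```python
import numpy as np

# Test data mirroring the 'split' column above, including the outlier at 200.
rng = np.random.default_rng(0)
split = rng.normal(-0.1, 0.1, size=(10000, 10)).max(axis=1)
split[0] = 200

min_x, max_x = -1, 0.5
gridsize = 200  # seaborn's documented default number of KDE evaluation points


def points_in_view(grid, lo=min_x, hi=max_x):
    """Count the grid points that fall inside the visible x range (hypothetical helper)."""
    return int(np.count_nonzero((grid >= lo) & (grid <= hi)))


# Assumption: without clip, the grid spans the full data range (outlier included).
full_grid = np.linspace(split.min(), split.max(), gridsize)

# Assumption: with clip=(min_x, max_x), the grid is limited to that range.
clipped_grid = np.linspace(max(split.min(), min_x), min(split.max(), max_x), gridsize)

print("grid step without clip:", full_grid[1] - full_grid[0])
print("points in view without clip:", points_in_view(full_grid))
print("points in view with clip:", points_in_view(clipped_grid))
```

Under these assumptions the unclipped grid has a step of about 1 on the x axis, so only one or two evaluation points fall between -1 and 0.5 and the line between them looks straight. With the clipped grid, all 200 points lie in the visible range. Passing a larger `gridsize` through `kde_kws` should also add evaluation points, but most of them would still be spent on the empty stretch up to the outlier.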