python matplotlib scipy seaborn

How can I change the distribution curve (kde) smoothing of a histogram?


With the following code, I can plot a histogram of my data together with its kernel density estimate (KDE).

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv')


plt.subplots(figsize=(7, 6), dpi=100)

# Histogram of 'depth' for diamonds with cut == 'Good', with a KDE curve overlaid
sns.histplot(df.loc[df.cut == 'Good', 'depth'], bins=100, color="blue", label="pcts", kde=True)

plt.title('Diamonds Histogram')
plt.legend();

The resulting graph output is as follows:

[image: resulting histogram with KDE curve]

How can I adjust the KDE curve so that it is less smooth and follows the shape of the histogram more closely?

The result should look more like the red curve in this plot: [image]

Or am I just mixing things up and the curve I am looking for corresponds to a probability density function (pdf)?


Solution

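  • You can pass sns.kdeplot() keywords to sns.histplot() as a dictionary via the kde_kws argument. To change the smoothness of the KDE curve, adjust the bw_adjust parameter: values below 1 make the curve follow the histogram more closely, and values above 1 make it smoother.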

    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    random_data = np.random.normal(0, 0.1, 1000)
    
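    fig, ax = plt.subplots(nrows=2)
    # Top: bw_adjust < 1 narrows the bandwidth, so the curve is less smooth
    sns.histplot(random_data, bins=100, kde=True, kde_kws={"bw_adjust": 0.25}, ax=ax[0])
    # Bottom: bw_adjust > 1 widens the bandwidth, so the curve is smoother
    sns.histplot(random_data, bins=100, kde=True, kde_kws={"bw_adjust": 5}, ax=ax[1])
    plt.show()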
    

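To see what bw_adjust does without any plotting, here is a minimal sketch that uses scipy.stats.gaussian_kde directly. It assumes that bw_adjust acts as a plain multiplier on the default bandwidth factor, which is how I understand seaborn to apply it. The helper names kde_curve and roughness are made up for this illustration and are not part of seaborn or SciPy.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic data, similar to the example above (seeded for reproducibility)
rng = np.random.default_rng(0)
data = rng.normal(0, 0.1, 1000)
grid = np.linspace(data.min(), data.max(), 400)


def kde_curve(samples, grid, bw_adjust=1.0):
    """Evaluate a Gaussian KDE on grid, scaling the default bandwidth factor.

    Hypothetical helper: mimics the assumed effect of seaborn's bw_adjust.
    """
    kde = gaussian_kde(samples)                # default bandwidth (Scott's rule)
    kde.set_bandwidth(kde.factor * bw_adjust)  # scale the bandwidth factor
    return kde(grid)


def roughness(y):
    """Sum of absolute second differences: larger means a wigglier curve."""
    return float(np.abs(np.diff(y, n=2)).sum())


rough_curve = kde_curve(data, grid, bw_adjust=0.25)
smooth_curve = kde_curve(data, grid, bw_adjust=5)

print("roughness at bw_adjust=0.25:", roughness(rough_curve))
print("roughness at bw_adjust=5:   ", roughness(smooth_curve))
```

The smaller bw_adjust gives the larger roughness value, which matches the wigglier curve in the top panel. A very small value will start tracing noise in individual bins, so it is worth trying a few values between roughly 0.2 and 1.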