With the following code I can visualize data as a histogram together with its kernel density estimate (KDE):
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv')
plt.subplots(figsize=(7,6), dpi=100)
# Histogram of depth for 'Good'-cut diamonds, with a KDE curve overlaid
sns.histplot(df.loc[df.cut == 'Good', 'depth'], bins=100, color="blue", label="pcts", kde=True)
plt.title('Depth of Good-cut diamonds')
plt.legend();
The resulting graph output is as follows:
How can I adjust the KDE curve so that it is less smooth and follows the shape of the histogram more closely?
The result should be something more like the red curve in this plot:
Or am I mixing things up, and the curve I am looking for is actually a probability density function (PDF)?
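You can pass sns.kdeplot() keywords to sns.histplot as a dictionary via the kde_kws argument. To change the smoothness of the KDE curve, adjust the bw_adjust parameter: values below 1 make the curve follow the histogram more closely, while values above 1 make it smoother.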
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
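random_data = np.random.normal(0, 0.1, 1000)
fig, ax = plt.subplots(nrows=2)
# Small bw_adjust: narrower kernel, so the curve follows the histogram closely
sns.histplot(random_data, bins=100, kde=True, kde_kws={"bw_adjust": 0.25}, ax=ax[0])
# Large bw_adjust: wider kernel, so the curve is much smoother
sns.histplot(random_data, bins=100, kde=True, kde_kws={"bw_adjust": 5}, ax=ax[1])
plt.show()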
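To see why bw_adjust changes the smoothness, here is a minimal NumPy-only sketch of a Gaussian KDE in which the kernel bandwidth is a reference value multiplied by bw_adjust. It assumes Scott's rule of thumb as the reference bandwidth; seaborn's internals may differ in detail. The helper names gaussian_kde_curve and roughness are hypothetical and exist only for this illustration.

```python
import numpy as np


def gaussian_kde_curve(samples, grid, bw_adjust=1.0):
    """Evaluate a 1-D Gaussian KDE on `grid` (hypothetical helper).

    The bandwidth is Scott's rule (std * n**(-1/5)) scaled by `bw_adjust`.
    Scott's rule is an assumption made for this sketch.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    bandwidth = samples.std(ddof=1) * n ** (-1 / 5) * bw_adjust
    # One Gaussian bump per sample, averaged over all samples.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))


def roughness(curve):
    """Sum of squared second differences: larger means a wigglier curve."""
    return float(np.sum(np.diff(curve, n=2) ** 2))


rng = np.random.default_rng(0)
random_data = rng.normal(0, 0.1, 1000)
grid = np.linspace(-0.5, 0.5, 400)

# Compare the same multipliers used in the plots above, plus the default of 1.
for bw_adjust in (0.25, 1.0, 5.0):
    curve = gaussian_kde_curve(random_data, grid, bw_adjust)
    print(f"bw_adjust={bw_adjust}: roughness={roughness(curve):.4f}")
```

The roughness value shrinks as bw_adjust grows, which matches the two plots: a small multiplier lets the curve react to individual bins, while a large one averages them away. For the histogram in the question, adding kde_kws={"bw_adjust": 0.25} (or another value below 1) to the sns.histplot call should have the same effect.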