I have some data that I have placed into a pandas DataFrame, and I plotted a bar plot of the unique value counts for a particular column.
I would like to control the bandwidth of the pandas built-in df.plot.density() function, which plots the KDE of the data. Is this possible, or am I better off with scikit-learn, SciPy, or something else?
Thanks
As pointed out by @Jan, you could use seaborn for this; it makes it easy to control the bandwidth of a KDE plot. Here is an example with random normal data:
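import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Three groups ('a', 'b', 'c') of random normal data
d = pd.DataFrame({'x': np.random.choice(['a', 'b', 'c'], 100), 'y': np.random.randn(100)})
fig, axes = plt.subplots(1, 3)
for name, g in d.groupby('x'):
    g['y'].plot.density(ax=axes[0], label=name)  # pandas default bandwidth
    # Note: seaborn 0.11+ deprecates `bw` in favor of `bw_method` / `bw_adjust`
    sns.kdeplot(g['y'], bw=0.25, ax=axes[1], label=name)
    sns.kdeplot(g['y'], bw=0.75, ax=axes[2], label=name)
axes[0].set_title('pandas plot.density', fontsize='12')
axes[1].set_title('seaborn kde with \n 0.25 bandwidth', fontsize='12')
axes[2].set_title('seaborn kde with \n 0.75 bandwidth', fontsize='12')
plt.legend()  # adds the legend to the last (current) axes only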
This produces the following plot for comparison:
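
On the original question of whether pandas can do this by itself: plot.density() also takes a bw_method argument, which it hands to scipy.stats.gaussian_kde (so SciPy needs to be installed). Below is a minimal sketch of that route, reusing the same kind of random data as above. The seed, the rng name, and the figure size are my own choices for illustration. Note that a scalar bw_method is used by SciPy as a factor that scales the data's standard deviation, so the same number will not necessarily give the same curve as in seaborn.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same kind of data as above; the seed is only there for reproducibility
rng = np.random.default_rng(0)
d = pd.DataFrame({"x": rng.choice(["a", "b", "c"], 100),
                  "y": rng.standard_normal(100)})

fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
for ax, bw in zip(axes, (0.25, 0.75)):
    for name, g in d.groupby("x"):
        # bw_method is forwarded to scipy.stats.gaussian_kde;
        # it also accepts 'scott', 'silverman', or a callable
        g["y"].plot.density(bw_method=bw, ax=ax, label=name)
    ax.set_title(f"pandas plot.density, bw_method={bw}")
    ax.legend()
fig.tight_layout()
```

A smaller value gives a wigglier curve that follows the samples more closely, and a larger value gives a smoother one. Call plt.show() at the end if you run this as a script.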