I would like to plot a histplot but using points rather than bars.
import pandas as pd
import seaborn as sns
from scipy.stats import binom

x_n10_p0_6 = binom.rvs(n=10, p=0.6, size=10000, random_state=0)
x_n10_p0_8 = binom.rvs(n=10, p=0.8, size=10000, random_state=0)
x_n20_p0_8 = binom.rvs(n=20, p=0.6, size=10000, random_state=0)
df = pd.DataFrame({
    'x_n10_p0_6': x_n10_p0_6,
    'x_n10_p0_8': x_n10_p0_8,
    'x_n20_p0_8': x_n20_p0_8
})
sns.histplot(df)
This is what I'm getting:
I would like to see something like this:
Source: https://en.wikipedia.org/wiki/Binomial_distribution#/media/File:Binomial_distribution_pmf.svg
histplot has an element parameter, but it only accepts the values {"bars", "step", "poly"}.
You are working with discrete distributions. A kde plot, by contrast, approximates a continuous distribution by smoothing out the input values. As a result, a kdeplot of your discrete values gives only a crude approximation of the plot you seem to be after.
Seaborn's histplot currently only implements bars for discrete distributions. However, you can mimic such a plot with matplotlib. Here is an example:
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
from scipy.stats import binom
import pandas as pd
import numpy as np
x_n10_p0_6 = binom.rvs(n=10, p=0.6, size=10000, random_state=0)
x_n10_p0_8 = binom.rvs(n=10, p=0.8, size=10000, random_state=0)
# p=0.6 is kept exactly as in the question, although the variable name suggests 0.8
x_n20_p0_8 = binom.rvs(n=20, p=0.6, size=10000, random_state=0)
df = pd.DataFrame({
    'x_n10_p0_6': x_n10_p0_6,
    'x_n10_p0_8': x_n10_p0_8,
    'x_n20_p0_8': x_n20_p0_8
})
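for col in df.columns:
    xmin = df[col].min()
    xmax = df[col].max()
    # one unit-wide bin centered on each integer from xmin to xmax
    counts, _ = np.histogram(df[col], bins=np.arange(xmin - 0.5, xmax + 1, 1))
    # one point per integer value instead of a bar
    plt.scatter(range(xmin, xmax + 1), counts, label=col)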
plt.legend()
plt.gca().xaxis.set_major_locator(MaxNLocator(integer=True)) # force integer ticks for discrete x-axis
plt.ylim(bottom=0)  # start the y-axis at zero
plt.show()
Note that seaborn's histplot has many more options than shown in this example (e.g. scaling the counts down to densities).
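If you want that scaling with the point-based approach, you can normalize the counts yourself. The following is only a sketch under a few assumptions: `empirical_pmf` is a helper name I made up, it handles a single sample rather than the whole dataframe, and the overlay of the exact `binom.pmf` is there purely for comparison.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom


def empirical_pmf(samples):
    """Return the distinct values in `samples` and their relative frequencies.

    Hypothetical helper for illustration; not part of seaborn or matplotlib.
    """
    values, counts = np.unique(samples, return_counts=True)
    return values, counts / counts.sum()


# Same sampling setup as the first column in the example above.
n, p = 10, 0.6
samples = binom.rvs(n=n, p=p, size=10000, random_state=0)

values, probs = empirical_pmf(samples)

fig, ax = plt.subplots()
# Filled markers: relative frequencies observed in the sample.
ax.scatter(values, probs, label="empirical")
# Hollow markers: exact binomial probabilities for comparison.
k = np.arange(0, n + 1)
ax.scatter(k, binom.pmf(k, n, p), facecolors="none", edgecolors="black",
           label="binom.pmf")
ax.set_ylim(bottom=0)
ax.legend()
```

Call `plt.show()` afterwards to display the figure. Values that never occur in the sample are simply absent from `values`, so they get no point rather than a point at zero.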