I'd like to show a density plot for many samples. Each sample belongs to a particular grouping variable. I can plot each individual density plot like so:
import matplotlib.pyplot as plt
import seaborn as sns

fmri = sns.load_dataset("fmri")[['subject','timepoint','region','signal']].drop_duplicates(['subject','timepoint','region'])
region2col = {'parietal': 'red', 'frontal': 'blue'}
fig, ax = plt.subplots(figsize=(22, 10))
for subject in fmri.subject.unique():
    temp = fmri.loc[fmri.subject == subject]
    for region in temp['region'].unique():
        temp2 = temp.loc[temp.region == region]
        # distplot is deprecated since seaborn 0.11; sns.kdeplot is its KDE-only replacement
        sns.distplot(
            temp2['signal'],
            label=region,
            color=region2col[region],
            kde=True, hist=False,
            ax=ax
        )
However, I'd like instead to draw an overall density of the distribution for each region (same axes as above, signal and density), but with a shaded area for the extremes (maximum and minimum at each signal point) and an overall fitted line describing the general trend. Similar to this:
# Example only, to show the formatting wanted.
# The x axis should show "signal",
# the y axis should show density.
g = sns.relplot(x="timepoint", y="signal",
                hue="region",
                kind="line", data=fmri)
plt.show()
Is this possible?
This is probably not the fastest method, but you could calculate the KDE for each subject/region pair over a common range of signal values, and then let lineplot aggregate the resulting curves per region:
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# common grid on which every subject/region KDE is evaluated
x = np.linspace(fmri['signal'].min(), fmri['signal'].max(), 100)
temp = fmri.groupby(['subject','region'])['signal'].apply(
    lambda s: pd.Series(gaussian_kde(s).evaluate(x), index=pd.Index(x, name='x')))
temp = temp.reset_index(name='kde')  # long format: subject, region, x, kde
plt.figure()
sns.lineplot(data=temp, x='x', y='kde', hue='region')
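One caveat: by default lineplot shades a confidence interval around the mean, not the minimum and maximum. If the band should span the actual extremes across subjects, a rough sketch is to evaluate every subject's KDE on the shared grid, take the pointwise min, mean and max per region, and draw them with matplotlib's fill_between. The helper names kde_envelope and plot_envelope and their parameters are made up for this illustration; they are not part of seaborn or scipy. Using the mean as the "general trend" line is also an assumption.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde


def kde_envelope(df, x, group_col="region", unit_col="subject", value_col="signal"):
    """Evaluate one KDE per (group, unit) on the grid x and summarise per group.

    Hypothetical helper: returns a long DataFrame with the columns
    [group_col, 'x', 'min', 'mean', 'max'].
    """
    parts = []
    for group, group_df in df.groupby(group_col):
        # One density curve per unit (e.g. per subject), all on the same grid.
        # gaussian_kde needs at least two distinct values per unit.
        curves = np.vstack([
            gaussian_kde(unit_df[value_col].to_numpy()).evaluate(x)
            for _, unit_df in group_df.groupby(unit_col)
        ])
        parts.append(pd.DataFrame({
            group_col: group,
            "x": x,
            "min": curves.min(axis=0),    # lower edge of the shaded band
            "mean": curves.mean(axis=0),  # assumed "general trend" line
            "max": curves.max(axis=0),    # upper edge of the shaded band
        }))
    return pd.concat(parts, ignore_index=True)


def plot_envelope(env, ax, colors=None, group_col="region"):
    """Draw the mean curve and shade min..max for each group on a matplotlib Axes."""
    for group, g in env.groupby(group_col):
        kwargs = {} if colors is None else {"color": colors[group]}
        (line,) = ax.plot(g["x"].to_numpy(), g["mean"].to_numpy(), label=group, **kwargs)
        ax.fill_between(g["x"].to_numpy(), g["min"].to_numpy(), g["max"].to_numpy(),
                        alpha=0.3, color=line.get_color())
    ax.set_xlabel("signal")
    ax.set_ylabel("density")
    ax.legend()


# Usage with the fmri frame and region2col dict from the question:
# import matplotlib.pyplot as plt
# x = np.linspace(fmri['signal'].min(), fmri['signal'].max(), 100)
# env = kde_envelope(fmri, x)
# fig, ax = plt.subplots()
# plot_envelope(env, ax, colors=region2col)
# plt.show()
```

Keeping the summary step separate from the drawing step means the band definition can be swapped (for example, quantiles instead of min and max) without touching the plotting code.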