How can I plot multiple stacked histograms using Seaborn? I tried the following code, but it raised a dimensions error: ValueError: Length of list vectors must match length of data...
df = pd.DataFrame({'id': [1,2,3,4,5,6,7,8,9,10],
'val1': ['a','b',np.nan,np.nan,'a','a',np.nan,np.nan,np.nan,'b'],
'val2': [7,0.2,5,8,np.nan,1,0,np.nan,1,1],
'cat': ['yes','no','no','no','yes','yes','yes','yes','no','yes'],
})
display(df)
sns.histplot(data=df, y=['val1', 'val2'], hue='cat', multiple='stack')
Desired Plot:
val1 "no" freq = 1 and "yes" = 4
val2 "no" freq = 4 and "yes" = 4
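I don't think you'll be able to do this directly from your current data frame. As far as I can tell, histplot treats a list passed to y as a vector of data rather than as a list of column names, so the two-element list ['val1', 'val2'] is compared against the 10 rows of df and the length check fails. You need to reshape to a long-form dataframe that has val1/val2 in one column and yes/no in another, with one row per non-missing value.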
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'id': [1,2,3,4,5,6,7,8,9,10],
'val1': ['a','b',np.nan,np.nan,'a','a',np.nan,np.nan,np.nan,'b'],
'val2': [7,0.2,5,8,np.nan,1,0,np.nan,1,1],
'cat': ['yes','no','no','no','yes','yes','yes','yes','no','yes'],
})
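# Keep the rows where val1 is present, then keep only 'cat' and tag the source column.
val1 = df[['cat', 'val1']].dropna().drop(columns='val1')
val1['val'] = 'val1'
# Same for val2.
val2 = df[['cat', 'val2']].dropna().drop(columns='val2')
val2['val'] = 'val2'
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
plot_df = pd.concat([val1, val2]).sort_values(by='cat')
sns.histplot(data=plot_df, x='val', stat='count', hue='cat', multiple='stack')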
plt.show()
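If you have more than two value columns, the same long-form frame can probably be built in one step with DataFrame.melt instead of one block per column. This is only a sketch of that alternative, not something tested beyond the example data; the names long_df, value and counts are mine. The crosstab at the end is just a way to check the bar heights against the desired frequencies before plotting.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'val1': ['a', 'b', np.nan, np.nan, 'a', 'a', np.nan, np.nan, np.nan, 'b'],
    'val2': [7, 0.2, 5, 8, np.nan, 1, 0, np.nan, 1, 1],
    'cat': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'yes', 'yes', 'no', 'yes'],
})

# Reshape to long form: one row per (cat, source column, value) triple.
long_df = df.melt(id_vars='cat', value_vars=['val1', 'val2'],
                  var_name='val', value_name='value')

# Drop the rows whose original cell was missing, then keep only
# the two columns the plot needs.
plot_df = long_df.dropna(subset=['value'])[['cat', 'val']]

# Tabulate how many rows fall in each (val, cat) pair; these are
# the heights of the stacked segments.
counts = pd.crosstab(plot_df['val'], plot_df['cat'])
print(counts)
```

plot_df here has the same two columns (cat and val) as in the answer, so it should work with the same call: sns.histplot(data=plot_df, x='val', hue='cat', multiple='stack').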