I have a question about how to organize my data to make a bar plot with seaborn. My input data looks like this:
influencer  platform   reach
person a    instagram  10000
person b    instagram   5000
person a    instagram   8000
person a    youtube    13000
What I need is a barplot that shows the sum of the reach per category per influencer.
I'm getting quite close with seaborn, but it shows the mean of the reach column rather than the total:
sns.barplot(x='influencer', y='reach', data=df, hue='platform')
I also tried arranging my input with a crosstab:
pd.crosstab(df['influencer'], df['platform'])
but instead of the totals, I get the number of entries.
Feels like I'm close, but I can't get it fully done. Could you please help me?
Instead of changing the data, use the estimator parameter to change how sns.barplot aggregates the categories:
estimator: Statistical function to estimate within each categorical bin.
# ci=None hides the error bars; seaborn 0.12+ deprecates ci in favor of errorbar=None
sns.barplot(data=df, x='influencer', y='reach', hue='platform',
            estimator=sum, ci=None)
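To see why the default plot differs from the one with estimator=sum, it may help to compare the two aggregations directly in pandas. This is only a sketch: the DataFrame is rebuilt by hand from the sample rows in the question, and the name `stats` is made up for illustration.

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand (assumed to match the real df)
df = pd.DataFrame({
    "influencer": ["person a", "person b", "person a", "person a"],
    "platform": ["instagram", "instagram", "instagram", "youtube"],
    "reach": [10000, 5000, 8000, 13000],
})

# Mean (what barplot draws by default) next to sum (what estimator=sum draws)
stats = df.groupby(["influencer", "platform"])["reach"].agg(["mean", "sum"])
print(stats)
```

The two columns only differ for groups with more than one row, here person a on instagram.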
Or if you still want to compute the sums manually, use groupby.sum:
sns.barplot(data=df.groupby(['influencer', 'platform'], as_index=False)['reach'].sum(),
            x='influencer', y='reach', hue='platform')
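As for the crosstab attempt: by default pd.crosstab counts rows, but it should return totals if you pass values and aggfunc. A minimal sketch, again assuming a DataFrame rebuilt from the sample rows in the question (`totals` and `wide` are illustrative names):

```python
import pandas as pd

# Sample rows from the question, rebuilt by hand (assumed to match the real df)
df = pd.DataFrame({
    "influencer": ["person a", "person b", "person a", "person a"],
    "platform": ["instagram", "instagram", "instagram", "youtube"],
    "reach": [10000, 5000, 8000, 13000],
})

# Long format: one row per (influencer, platform) pair, the shape sns.barplot expects
totals = df.groupby(["influencer", "platform"], as_index=False)["reach"].sum()

# Wide format: crosstab sums reach instead of counting rows when given values/aggfunc
wide = pd.crosstab(df["influencer"], df["platform"],
                   values=df["reach"], aggfunc="sum")

print(totals)
print(wide)
```

Combinations that never occur (person b on youtube here) come out as NaN in the wide table. The long `totals` frame is the one to pass to sns.barplot.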