I have some data:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Plan': [40, 50, 60, 25],
    'Fact': [10, 20, 30, 15],
    'financing_type': ['type_1', 'type_2', 'type_1', 'type_3']
})
I need to plot two bars, Plan and Fact, each stacked in different colors according to the sum for each financing_type.
I got the result I want this way:
df_type_1 = df[df['financing_type'] == 'type_1']
df_type_2 = df[df['financing_type'] == 'type_2']
df_type_3 = df[df['financing_type'] == 'type_3']
plt.bar(['Plan', 'Fact'], [df_type_1['Plan'].sum(), df_type_1['Fact'].sum()], color='blue', label='type_1')
plt.bar(
    ['Plan', 'Fact'],
    [df_type_2['Plan'].sum(), df_type_2['Fact'].sum()],
    bottom=[df_type_1['Plan'].sum(), df_type_1['Fact'].sum()],
    color='red',
    label='type_2',
)
plt.bar(
    ['Plan', 'Fact'],
    [df_type_3['Plan'].sum(), df_type_3['Fact'].sum()],
    bottom=[
        df_type_1['Plan'].sum() + df_type_2['Plan'].sum(),
        df_type_1['Fact'].sum() + df_type_2['Fact'].sum(),
    ],
    color='green',
    label='type_3',
)
plt.legend()
plt.show()
How can I do this for the general case, when I don't know how many different types there are in the financing_type column?
Here is an approach: melt the dataframe, then use pivot_table, summing the values for each type, and plot the result as stacked bars.
import pandas as pd
# Given a dataframe
df = pd.DataFrame({
    'Plan': [40, 50, 60, 25],
    'Fact': [10, 20, 30, 15],
    'financing_type': ['type_1', 'type_2', 'type_1', 'type_3']})
# Melt the DataFrame
df_melted = df.melt(id_vars=['financing_type'], var_name='Category', value_name='Value')
# Pivot the dataFrame to get the sum of 'Plan' and 'Fact' for each 'financing_type'
df_pivot = df_melted.pivot_table(index='Category', columns='financing_type', values='Value', aggfunc='sum')
# Reorder the index of the pivoted dataframe
df_pivot = df_pivot.reindex(['Plan', 'Fact'])
# Create a stacked bar plot
df_pivot.plot.bar(stacked=True, rot=0, xlabel='')
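If the melt step feels like a detour, a shorter route that should give the same table is a groupby followed by a transpose. This is only a sketch and not part of the approach above; the variable name totals is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Plan': [40, 50, 60, 25],
    'Fact': [10, 20, 30, 15],
    'financing_type': ['type_1', 'type_2', 'type_1', 'type_3']})

# Sum 'Plan' and 'Fact' per financing_type, then transpose so that
# 'Plan' and 'Fact' become the rows (one bar each) and the types the columns
totals = df.groupby('financing_type')[['Plan', 'Fact']].sum().T

# One stacked bar per row, one colored segment per financing_type
ax = totals.plot.bar(stacked=True, rot=0, xlabel='')
```

Because the columns are selected as ['Plan', 'Fact'], the rows already come out in that order, so no reindex is needed here.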
Alternatively, you can use seaborn to create a stacked, weighted histogram:
import seaborn as sns
import pandas as pd
# Given a dataframe
df = pd.DataFrame({
    'Plan': [40, 50, 60, 25],
    'Fact': [10, 20, 30, 15],
    'financing_type': ['type_1', 'type_2', 'type_1', 'type_3']})
# Melt the DataFrame
df_melted = df.melt(id_vars=['financing_type'], var_name='Category', value_name='Value')
# Create a stacked, weighted histogram
sns.histplot(df_melted, x='Category', hue='financing_type', weights='Value', multiple='stack', alpha=1)
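If you would rather stay close to the plain matplotlib code from the question, the three hand-written plt.bar calls can be turned into a loop that keeps a running bottom. The following is a minimal sketch under that assumption; the helper name plot_stacked_by_group and its parameters are hypothetical, and colors are left to matplotlib's default cycle instead of being set per type.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Plan': [40, 50, 60, 25],
    'Fact': [10, 20, 30, 15],
    'financing_type': ['type_1', 'type_2', 'type_1', 'type_3']})

def plot_stacked_by_group(df, group_col, value_cols, ax=None):
    """Hypothetical helper: one stacked bar per value column, one segment per group."""
    if ax is None:
        _, ax = plt.subplots()
    # Sum each value column per group; works for any number of groups
    sums = df.groupby(group_col)[value_cols].sum()
    # Running height of each bar, used as the 'bottom' of the next segment
    bottom = np.zeros(len(value_cols))
    for group_name, row in sums.iterrows():
        heights = row.to_numpy()
        ax.bar(value_cols, heights, bottom=bottom, label=group_name)
        bottom = bottom + heights
    ax.legend()
    return ax

ax = plot_stacked_by_group(df, 'financing_type', ['Plan', 'Fact'])
# Call plt.show() afterwards when running this as a script
```

The loop does the same thing as the original code, only the bottom values are accumulated instead of being written out by hand for each type.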