I have data somewhat like this, showing the net cash flow (NCF) per portfolio and the date of each flow:
import datetime
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({
    'PORTFOLIO': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'C'],
    'DATE': ['28-02-2018', '28-02-2018', '28-02-2018', '10-10-2018', '10-10-2018', '01-12-2018', '31-12-2018',
             '30-09-2018', '30-09-2018', '30-09-2018', '31-12-2018', '31-01-2019', '28-02-2019', '05-03-2019', '01-07-2019'],
    'NCF': [856000, 900000, 45000, 2005600, 43900, 46700, 900000, 7890000, 821000, 95000, 400000, 7000000, 82500, 10000000, 1525000],
})
df2=df.groupby(['PORTFOLIO','DATE']).sum().reset_index()
df2
I group it because I am only interested in the total cash flow per day.
Now I want to visualize the cash flow in a bar chart, one chart per portfolio.
sns.set(style='dark', color_codes=True)
g=sns.FacetGrid(df2, col="PORTFOLIO", hue='PORTFOLIO',col_wrap=3, height=5, sharey=False, sharex=False)
g=g.map(plt.bar,'DATE','NCF')
g.set_xticklabels(rotation=45)
plt.tight_layout()
plt.show()
Unfortunately, the seaborn FacetGrid multiplot gives me incorrect values on the x axis, no matter what I try to do with the dataset. It is as if the first portfolio sets the tick values and the rest just have to follow, even though the dates are incorrect.
If I remove
g.set_xticklabels(rotation=45)
then portfolio C gets the correct date, and it seems like the correct dates on B are hidden behind the incorrect 'A' dates.
The order of the bars changes, but it is still not correct (it should be monotonically increasing by date).
What am I doing wrong, and how can I fix this?
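First convert to datetime and sort. The dates are day-first strings, so pass the format explicitly, otherwise values such as '01-12-2018' can be parsed as month-first:
df2.DATE = pd.to_datetime(df2.DATE, format='%d-%m-%Y')
df2 = df2.sort_values(by=['PORTFOLIO', 'DATE'])
# Back to strings, so that each date is plotted as one categorical bar
df2.DATE = df2.DATE.astype(str)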
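To see why the conversion matters, here is a minimal sketch using only the standard library, with portfolio B's dates from the question. The variable names (raw, as_text, as_dates) are just for illustration. Sorting the day-first strings compares them character by character, so the day decides the order; parsing them first gives a chronological order.

```python
from datetime import datetime

# Portfolio B's dates from the question, as day-month-year strings.
raw = ['30-09-2018', '31-12-2018', '31-01-2019', '28-02-2019', '05-03-2019']

# Plain string sort: compared left to right, so the day comes first.
as_text = sorted(raw)

# Parse with an explicit day-first format, then sort chronologically.
as_dates = sorted(raw, key=lambda s: datetime.strptime(s, '%d-%m-%Y'))

print(as_text)
print(as_dates)
```

The first list starts with '05-03-2019', which is actually the latest date, while the second keeps the dates in true time order. That is the same reason the grouped frame needs to_datetime before sort_values.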
You can access the individual axes with g.axes (based on this answer). So:
sns.set(style='dark', color_codes=True)
g=sns.FacetGrid(df2, col="PORTFOLIO", hue='PORTFOLIO',col_wrap=3, height=5, sharey=False, sharex=False)
g=g.map(plt.bar,'DATE','NCF')
g.set_xticklabels(rotation=45)
# Relabel each facet with its own portfolio's dates
for idx, v in enumerate(df2.PORTFOLIO.unique()):
    g.axes[idx].set_xticklabels(df2.loc[df2.PORTFOLIO == v, 'DATE'])
plt.tight_layout()
plt.show()
Gives you: