Suppose I have the following DataFrame df:
user start end mode
0 82 2009-04-19 05:49:50 2009-04-19 06:17:40 metro
1 82 2009-04-19 06:18:05 2009-04-19 06:22:44 foot
2 10 2007-06-26 11:32:29 2007-06-26 11:40:29 bus
3 10 2008-03-28 14:52:54 2008-03-28 15:59:59 metro
4 20 2011-08-27 06:13:01 2011-08-27 08:01:37 foot
5 20 2012-02-20 14:10:33 2012-02-20 14:29:59 bus
6 20 2012-02-21 01:22:05 2012-02-21 01:55:47 bus
I want to group df by the mode column and plot a histogram for each (year, month), showing how many trips of each mode there are in that month of that year.
So I do:
df[['mode']].groupby([df['start'].dt.year, df['start'].dt.month]).count().plot(kind='barh')
The result is a single bar per month showing the total count across all modes.
But I want to see the count of each mode separately (where available) for every month of each year.
You could use seaborn and pass the hue parameter.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'user': [82, 82, 10, 10, 20, 20, 20],
'start': ['2009-04-19 05:49:50',
'2009-04-19 06:18:05',
'2007-06-26 11:32:29',
'2008-03-28 14:52:54',
'2011-08-27 06:13:01',
'2012-02-20 14:10:33',
'2012-02-21 01:22:05'],
'end': ['2009-04-19 06:17:40',
'2009-04-19 06:22:44',
'2007-06-26 11:40:29',
'2008-03-28 15:59:59',
'2011-08-27 08:01:37',
'2012-02-20 14:29:59',
'2012-02-21 01:55:47'],
'mode': ['metro', 'foot', 'bus', 'metro', 'foot', 'bus', 'bus']})
df['start'] = pd.to_datetime(df['start'])
df['end'] = pd.to_datetime(df['end'])
df['year'] = df.start.dt.year
df['month'] = df.start.dt.month
# Count the rows for every (year, month, mode) combination.
df = df.groupby(['year','month','mode']).size().reset_index(name='count')
g = sns.barplot(data=df, y=df['year'].astype(str)+','+df['month'].astype(str),x='count', hue='mode', orient='h')
plt.ylim(reversed(plt.ylim()));
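If you would rather not depend on seaborn, a pandas-only variant of the same idea is sketched below. It is an alternative, not part of the approach above: the names trips and counts are made up for this example, and the frame is rebuilt from the sample data so the snippet runs on its own. The key step is unstack, which moves mode from the index into columns so that pandas draws one bar per mode within each (year, month) group.

```python
import pandas as pd

# Same sample data as above, reduced to the two columns that matter here.
# `trips` and `counts` are illustrative names, not from the original post.
trips = pd.DataFrame({
    'start': pd.to_datetime(['2009-04-19 05:49:50', '2009-04-19 06:18:05',
                             '2007-06-26 11:32:29', '2008-03-28 14:52:54',
                             '2011-08-27 06:13:01', '2012-02-20 14:10:33',
                             '2012-02-21 01:22:05']),
    'mode': ['metro', 'foot', 'bus', 'metro', 'foot', 'bus', 'bus'],
})

# Group by year, month and mode, count the rows in each group,
# then pivot the modes into columns (missing combinations become 0).
counts = (trips.groupby([trips['start'].dt.year.rename('year'),
                         trips['start'].dt.month.rename('month'),
                         'mode'])
               .size()
               .unstack(fill_value=0))

print(counts)
```

With matplotlib installed, counts.plot(kind='barh') should then draw the grouped horizontal bars, one colour per mode. Unlike the seaborn version, modes that do not occur in a given month still get a zero-length bar slot because of fill_value=0.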