In pd.groupby()
using a pd.Grouper()
along with a column fruit
is showing less number of groups shown in #4
below. There should have been fruit
with other dates
because those are there in the final output in #5
.
For example there is a (2020-01-01 00:00:00, 'mango')
group but no (2020-01-01 00:00:00, 'orange')
group in #4
etc. May be I am missing something. Thank you for your help.
Code is below:
# Library
import pandas as pd
# Data
date = [pd.Timestamp('01/01/2020'),
pd.Timestamp('01/03/2020'),
pd.Timestamp('01/20/2020'),
pd.Timestamp('09/01/2020'),
pd.Timestamp('09/03/2020'),
pd.Timestamp('09/20/2020'),
pd.Timestamp('12/01/2020'),
pd.Timestamp('12/03/2020'),
pd.Timestamp('12/20/2020')
]
df = pd.DataFrame({
'fruits': ['mango','mango','orange','orange','banana', 'mango', 'orange','banana', 'banana'],
'price': [10,12,7,9,3,1,2,11,13],
'date': date
})
# Grouper
# 1MS: month start frequency
p = pd.Grouper(freq='1MS', key='date')
print("#1-\n", p, '\n')
g = df.groupby(['fruits'])
print("#2-\n", g.groups, '\n')
g = df.groupby([p])
print("#3-\n", g.groups, '\n')
g = df.groupby([p, 'fruits'])
print("#4-\n", g.groups, '\n')
result = g.sum()
print("\n\n#5- result:\n", result)
Output:
#1-
TimeGrouper(key='date', freq=<MonthBegin>, axis=0, sort=True, dropna=True, closed='left', label='left', how='mean', convention='e', origin='start_day')
#2-
{'banana': [4, 7, 8], 'mango': [0, 1, 5], 'orange': [2, 3, 6]}
#3-
{2020-01-01 00:00:00: [0, 1, 2], 2020-02-01 00:00:00: [], 2020-03-01 00:00:00: [], 2020-04-01 00:00:00: [], 2020-05-01 00:00:00: [], 2020-06-01 00:00:00: [], 2020-07-01 00:00:00: [], 2020-08-01 00:00:00: [], 2020-09-01 00:00:00: [3, 4, 5], 2020-10-01 00:00:00: [], 2020-11-01 00:00:00: [], 2020-12-01 00:00:00: [6, 7, 8]}
#4-
{(2020-01-01 00:00:00, 'mango'): [0], (2020-09-01 00:00:00, 'mango'): [1], (2020-12-01 00:00:00, 'orange'): [2]}
#5- result:
price
date fruits
2020-01-01 mango 22
orange 7
2020-09-01 banana 3
mango 1
orange 9
2020-12-01 banana 24
orange 2
You found bug, already reported - BUG: pd.Grouper with a datetime key in conjunction with another key generates incorrect number of group keys. #51158