I have a pandas DataFrame with a timestamp index. I group it by hour, run a series of operations on each hour's values, and then need to write the results back to the original DataFrame:
for name, group in df.groupby(pd.Grouper(freq="1H")):
    if group.shape[0] > 0:
        # Operations on the group; returns a list of labels with the same length as the group
        results = some_function(group)
        df.loc[group.index, 'results'] = results
I get the error ValueError: Must have equal len keys and value when setting with an iterable,
but only after many successful iterations (hours) of the for loop. Any ideas?
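One possible cause is duplicated index values: df.loc[group.index, 'results'] then matches more rows than the group contains, so the number of keys and values no longer agree. A possible solution is to avoid the loop with assignment through loc and use groupby.apply instead: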
import pandas as pd

df = pd.DataFrame({'value': [10, 20, 30, 40]},
                  index=pd.to_datetime([
                      "2021-01-01 00:00:00",
                      "2021-01-01 00:00:00",
                      "2021-01-01 00:30:00",
                      "2021-01-01 01:00:00"]))

# custom function
def some_function(x):
    return range(len(x))

# add the results as a new column to each hourly group
def helper(group):
    group['results'] = some_function(group)
    return group

out = df.groupby(pd.Grouper(freq="1h"), group_keys=False).apply(helper)
print(out)
                     value  results
2021-01-01 00:00:00     10        0
2021-01-01 00:00:00     20        1
2021-01-01 00:30:00     30        2
2021-01-01 01:00:00     40        0
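To check whether duplicated timestamps really are the cause in your data, a minimal sketch along these lines may help. It reuses the toy data from above; the variable names (index_is_unique, duplicated_rows, first_group, selected) are only illustrative and not part of the original code.

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 00:00:00",  # duplicated timestamp
        "2021-01-01 00:30:00",
        "2021-01-01 01:00:00",
    ]),
)

# Falsy when at least one timestamp occurs more than once
index_is_unique = df.index.is_unique
print(index_is_unique)

# Rows whose timestamp is duplicated (keep=False marks every occurrence)
duplicated_rows = df[df.index.duplicated(keep=False)]
print(duplicated_rows)

# Take the first hourly group and select the same labels through loc:
# each duplicated label matches every row that carries it, so the
# selection has more rows than the group itself
name, first_group = next(iter(df.groupby(pd.Grouper(freq="1h"))))
selected = df.loc[first_group.index]
print(len(first_group), len(selected))
```

If the two printed lengths differ for a group, assigning a list of the group's length through df.loc[group.index, ...] cannot line up, which would explain why the error only appears once the loop reaches an hour containing duplicates.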
Another possible cause is a mismatch between the length of a group and the length of the array/list returned by your custom function. Here is a way to find the problematic data:
import pandas as pd

df = pd.DataFrame({'value': [10, 20, 30, 40]},
                  index=pd.to_datetime([
                      "2021-01-01 00:00:00",
                      "2021-01-01 00:00:00",
                      "2021-01-01 00:30:00",
                      "2021-01-01 01:00:00"]))

# simulate different lengths of lists returned from the function
def some_function(x):
    if len(x) == 1:
        return range(len(x) * 2)
    else:
        return range(len(x))

# print each group and both lengths before assigning
def helper(group):
    print(group)
    print(f'Length of group is {len(group)}')
    print(f'Length of output from function is {len(some_function(group))}')
    group['results'] = some_function(group)  # raises ValueError when the lengths differ
    return group

out = df.groupby(pd.Grouper(freq="1h"), group_keys=False).apply(helper)
# print(out)
                     value
2021-01-01 00:00:00     10
2021-01-01 00:00:00     20
2021-01-01 00:30:00     30
Length of group is 3
Length of output from function is 3
                     value
2021-01-01 01:00:00     40
Length of group is 1
Length of output from function is 2
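With many hours of data, printing every group gets noisy. As a rough alternative, the sketch below only collects the groups whose length differs from the function's output. The some_function here is the same simulated stand-in as above, and the mismatches list is a hypothetical name introduced for illustration; substitute your real function.

```python
import pandas as pd

df = pd.DataFrame(
    {"value": [10, 20, 30, 40]},
    index=pd.to_datetime([
        "2021-01-01 00:00:00",
        "2021-01-01 00:00:00",
        "2021-01-01 00:30:00",
        "2021-01-01 01:00:00",
    ]),
)

# Simulated stand-in for the real function: returns too many labels
# for single-row groups
def some_function(x):
    if len(x) == 1:
        return list(range(len(x) * 2))
    return list(range(len(x)))

# Collect (hour, group length, output length) for every mismatching group
mismatches = []
for name, group in df.groupby(pd.Grouper(freq="1h")):
    if group.empty:
        continue  # the Grouper also yields hours with no rows
    n_out = len(some_function(group))
    if n_out != len(group):
        mismatches.append((name, len(group), n_out))

print(mismatches)
```

Each entry names the hour to inspect, so you can look at df for that hour directly instead of scanning the printed output.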