I'm well and truly stumped on this.
I have a MultiIndex DataFrame that looks like this:
               data
index1 index2
0      1          8
       2          7
       3          6
       4          9
1      1          3
       2          4
       3          3
       4          6
2      1          5
       2          5
.... and so on
For each index1, I'm trying to sum the values in the data column over a range of index2 values, and put the results in a new DataFrame.
For example, if I summed the data values that correspond to the first 2 values of index2 for each index1 in the example above, I would want to get:
index1  summed_data
0                15
1                 7
2                10
Does anyone know how to do this?
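You don't need to change your input format. Group by the index1 level and sum the first two rows of each group (this selects by position, so it assumes the rows are sorted by index2 within each index1):
x = df.groupby(level='index1').agg({'data': lambda s: s.iloc[:2].sum()}).rename(columns={'data': 'summed_data'})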
Printing x then gives:
        summed_data
index1
0                15
1                 7
2                10
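
If the range should be defined by the index2 values themselves rather than by row position, one possible alternative is to filter on that level first and then group. The sketch below rebuilds the example frame from the question so it runs on its own; the bounds `lo` and `hi` and the names `in_range` and `summed` are placeholders I made up for illustration, not anything from the original post.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [(0, 1), (0, 2), (0, 3), (0, 4),
     (1, 1), (1, 2), (1, 3), (1, 4),
     (2, 1), (2, 2)],
    names=["index1", "index2"],
)
df = pd.DataFrame({"data": [8, 7, 6, 9, 3, 4, 3, 6, 5, 5]}, index=index)

# Hypothetical bounds: keep rows whose index2 label lies in [lo, hi], inclusive.
lo, hi = 1, 2
index2 = df.index.get_level_values("index2")
in_range = (index2 >= lo) & (index2 <= hi)

# Filter on the index2 labels first, then sum the remaining rows per index1.
summed = (
    df[in_range]
    .groupby(level="index1")["data"]
    .sum()
    .to_frame("summed_data")
)
print(summed)
```

For the sample data this should give the same sums as the positional version (15, 7 and 10). One difference to be aware of: an index1 group with no rows inside the range is simply absent from the result instead of showing a sum of 0.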