python python-3.x pandas group-by higher-order-functions

Group rows partially [Python] [Pandas]

Good morning everyone.

I have the following data:

import pandas as pd

info = {
'states': [-1, -1, -1, 1, 1, -1, 0, 1, 1, 1],
'values': [34, 29, 28, 30, 35, 33, 33, 36, 40, 41] }

df = pd.DataFrame(data=info)

print(df)

>>> 
    states   values
0       -1       34
1       -1       29
2       -1       28
3        1       30
4        1       35
5       -1       33
6        0       33
7        1       36
8        1       40
9        1       41

I need to group the data using PANDAS (and/or higher order functions) (already did the exercise using for loops), I need to group the data having the "states" column as a guide. But the grouping should not be of all the data, I only need to group the data that is neighboring... as follows:

Initial DataFrame:

    states   values
0       -1       34 ┐
1       -1       29 │    Group this part (states = -1)
2       -1       28 ┘
3        1       30 ┐    Group this part (states =  1)
4        1       35 ┘
5       -1       33     'Group' this part (states = -1)
6        0       33     'Group' this part (states =  0)
7        1       36 ┐
8        1       40 │    Group this part (states =  1)
9        1       41 ┘

It would result in a DataFrame, with a grouping by segments (from the "states" column) and in another column the sum of the data (from the "values" column).

Expected DataFrame:

    states   values
0       -1       91     (values=34+29+28)
1        1       65     (values=30+35)
2       -1       33
3        0       33
4        1       117    (values=36+40+41)

You who are more versed in these issues, perhaps you can help me perform this operation.

Thank you so much!

Solution

Identify the blocks/groups of rows using diff and cumsum then group the dataframe by these blocks and aggregate states with first and values with sum

b = df['states'].diff().ne(0).cumsum()
df.groupby(b).agg({'states': 'first', 'values': 'sum'})

Result

        states  values
states                
1           -1      91
2            1      65
3           -1      33
4            0      33
5            1     117