Search code examples
pythonpandasgroup-bymaxintervals

Find the highest value locations within an interval and for a specific index?


Given this pandas dataframe with three columns, 'room_id', 'temperature' and 'State'. How do I get a forth column 'Max' indicating wehn the value is a maximum for each interval where State is True and for each room ?

117     1.489000  True
        8.9       False
        2.5       False  
        4.370000  False  
        4.363333  True
        4.356667  True
118     4.35      True
        6.648000  True
        6.642667  True
        7.3       False
        9.4       False
        5.3       True
        7.1       True

What I am expecting

117     1.489000  True   max
        8.9       False
        2.5       False  
        4.370000  False  
        4.363333  True   max
        4.356667  True
118     4.35      True
        6.648000  True   max
        6.642667  True
        7.3       False
        9.4       False
        5.3       True
        7.1       True   max 

I used this : 

Max = df_state.groupby(masque.cumsum()[~masque])['temperature'].agg(['idxmax'])

But I found this : 

117 1.489000 True max 8.9 False 2.5 False
4.370000 False
4.363333 True
4.356667 True 118 4.35 True 6.648000 True max 6.642667 True 7.3 False 9.4 False 5.3 True 7.1 True max


I miss the last max of room 117 because the algorithm does not take into account the room id

Solution

  • You can use groupby.idxmax to get the index of the max per custom group:

    # get the indices of the max value per group
    idx = (df[df['State']].groupby(['room_id', (~df['State']).cumsum()])
           ['temperature'].idxmax()
           )
    
    # assign the new value
    df.loc[idx, 'max_temp'] = 'max'
    

    If you want the temperature value instead of a literax 'max':

    df.loc[idx, 'max'] = df.loc[idx, 'temperature']
    

    Output:

        room_id  temperature  State  max  max_temp
    0       117     1.489000   True  max  1.489000
    1       117     8.900000  False  NaN       NaN
    2       117     2.500000  False  NaN       NaN
    3       117     4.370000  False  NaN       NaN
    4       117     4.363333   True  max  4.363333
    5       117     4.356667   True  NaN       NaN
    6       118     4.350000   True  NaN       NaN
    7       118     6.648000   True  max  6.648000
    8       118     6.642667   True  NaN       NaN
    9       118     7.300000  False  NaN       NaN
    10      118     9.400000  False  NaN       NaN
    11      118     5.300000   True  NaN       NaN
    12      118     7.100000   True  max  7.100000