python pandas dataframe group-by

Group and multiply columns with conditions


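I'm trying to multiply two columns row by row until the product reaches a desired value (8 or less), but I need to group by MPRO and ID first. I also need to keep the first product that is already at or under the desired value and mark every later row in that group as DONE (this is the problematic part).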

MPRO ID Nuevo_I Nuevo_P
1 ID1 5 3
1 ID1 2 3
1 ID1 1 3
1 ID2 2 3
1 ID2 1 3
1 ID3 3 2
2 ID1 3 4
2 ID2 3 2
2 ID2 3 1
2 ID2 2 1

Desired Output:

MPRO ID Nuevo_I Nuevo_P F
1 ID1 5 3 15
1 ID1 2 3 6
1 ID1 1 3 DONE
1 ID2 2 3 6
1 ID2 1 3 DONE
1 ID3 3 2 6
2 ID1 3 4 12
2 ID2 3 2 6
2 ID2 3 1 DONE
2 ID2 2 1 DONE

I have tried with this code:

# product of the two columns
A = df['Nuevo_I'].mul(df['Nuevo_P'])

# within each (MPRO, ID) group, flag rows whose previous row was already <= 8
B = A.le(8).groupby([df['MPRO'], df['ID']]).shift(fill_value=False)

I hope someone can give me a hand, thanks!


Solution

  • You can use a groupby.apply:

    # compute the product
    s = df[['Nuevo_I', 'Nuevo_P']].prod(axis=1)
    
    # flag the rows that come after the first product <= 8 in each group
    m = (s.le(8).groupby([df['MPRO'], df['ID']], group_keys=False)
          .apply(lambda x: x.shift(fill_value=False).cummax())
        )
    
    # mask
    df['F'] = s.mask(m, 'DONE')
    

    Output:

       MPRO   ID  Nuevo_I  Nuevo_P     F
    0     1  ID1        5        3    15
    1     1  ID1        2        3     6
    2     1  ID1        1        3  DONE
    3     1  ID2        2        3     6
    4     1  ID2        1        3  DONE
    5     1  ID3        3        2     6
    6     2  ID1        3        4    12
    7     2  ID2        3        2     6
    8     2  ID2        3        1  DONE
    9     2  ID2        2        1  DONE
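
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question and gets the same mask without `apply`, chaining a grouped `shift` and a grouped `cummax` instead. This is a variant of the approach above, not the exact code from the answer. The names `THRESHOLD`, `keys`, `reached`, `prev_reached` and `done` are my own, and the casts to `int` and `object` are there only to avoid relying on dtype coercion that may differ between pandas versions.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "MPRO":    [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "ID":      ["ID1", "ID1", "ID1", "ID2", "ID2", "ID3", "ID1", "ID2", "ID2", "ID2"],
    "Nuevo_I": [5, 2, 1, 2, 1, 3, 3, 3, 3, 2],
    "Nuevo_P": [3, 3, 3, 3, 3, 2, 4, 2, 1, 1],
})

THRESHOLD = 8  # assumed name for the "desired value" in the question

# Grouping keys used for every per-group step
keys = [df["MPRO"], df["ID"]]

# Row-wise product of the two columns
s = df["Nuevo_I"] * df["Nuevo_P"]

# 1 where the product is already at or under the threshold, else 0
reached = s.le(THRESHOLD).astype(int)

# Look at the previous row of the same group (0 for the first row of a group)
prev_reached = reached.groupby(keys).shift(fill_value=0)

# Once a previous row has reached the threshold, all later rows stay flagged
done = prev_reached.groupby(keys).cummax().astype(bool)

# Keep the product where not flagged, write "DONE" elsewhere
df["F"] = s.astype(object).mask(done, "DONE")

print(df)
```

Because `F` mixes integers and strings, the column ends up with `object` dtype. If you need to do arithmetic on it later, keeping `done` as a separate boolean column may be more convenient.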