Python Pandas: Going through a list of cycles and marking points of interest


To explain my problem more easily, I have created a dataset:

data = {'Cycle': ['Set1', 'Set1', 'Set1', 'Set2', 'Set2', 'Set2', 'Set2'],
        'Value': [1, 2.2, 0.5, 0.2, 1, 2.5, 1]}

I want to create a loop that goes through the "Cycle" column and marks the max of each cycle with the letter A and the min with letter B, to output something like this:

POI = {'Cycle': ['Set1', 'Set1', 'Set1', 'Set2', 'Set2', 'Set2', 'Set2'],
       'Value': [1, 2.2, 0.5, 0.2, 1, 2.5, 1],
       'POI': [0, 'A', 'B', 'B', 0, 'A', 0]}

df2 = pd.DataFrame(POI)

I am new to Python, so as much detail as possible would be very helpful. In particular, I am not sure how to go through each cycle on its own to get these values, so an explanation of that part would be great.

Thanks


Solution

  • Using numpy.select and groupby.transform:

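    import numpy as np
    import pandas as pd

    df = pd.DataFrame(data)  # 'data' is the dict from the question

    # transform returns each cycle's max/min repeated on every row of that cycle
    g = df.groupby('Cycle')['Value']
    # default='0' keeps every choice a string; newer NumPy versions may
    # refuse to mix string choices with the implicit integer default 0
    df['POI'] = np.select([df['Value'].eq(g.transform('max')),
                           df['Value'].eq(g.transform('min'))],
                          ['A', 'B'], default='0')
    
    # if you want 0 as default value (not '0')
    df['POI'] = df['POI'].replace('0', 0)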
    

    output:

      Cycle  Value POI
    0  Set1    1.0   0
    1  Set1    2.2   A
    2  Set1    0.5   B
    3  Set2    0.2   B
    4  Set2    1.0   0
    5  Set2    2.5   A
    6  Set2    1.0   0
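
  • Since the question asks how to go through each cycle on its own, here is a minimal sketch of the same idea written as an explicit loop. It is an illustration, not a replacement for the vectorized version above: the names group_max and poi_loop are made up for this example, and it uses idxmax/idxmin, which mark only the first max/min row of a cycle when there are ties.

```python
import pandas as pd

df = pd.DataFrame({
    'Cycle': ['Set1', 'Set1', 'Set1', 'Set2', 'Set2', 'Set2', 'Set2'],
    'Value': [1, 2.2, 0.5, 0.2, 1, 2.5, 1],
})

# What transform('max') produces: the max of each cycle, repeated on
# every row that belongs to that cycle (same length as df).
group_max = df.groupby('Cycle')['Value'].transform('max')

# Start with the default marker 0 in every row; object dtype lets the
# column hold both the integer 0 and the letters.
df['POI'] = pd.Series(0, index=df.index, dtype=object)

# Iterating over a groupby yields one (name, sub-frame) pair per cycle.
for cycle_name, group in df.groupby('Cycle'):
    # idxmax/idxmin return the index label of the first max/min row.
    df.loc[group['Value'].idxmax(), 'POI'] = 'A'
    df.loc[group['Value'].idxmin(), 'POI'] = 'B'

poi_loop = df['POI'].tolist()
```

    On this dataset the loop produces the same POI column as the numpy.select version. The two differ only on ties: comparing against transform('max') marks every row that equals the cycle's max, while idxmax marks just the first one.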