python pandas dataframe multi-index

How to apply a condition in pandas, but only to a limited number of rows?


I have a multi-indexed pandas DataFrame (the original post showed it as an image).

I want to update the Crop and Avl columns, say with 'Tomato' and 0, but only for a limited number of rows (say, I need only 10 rows for Tomato out of those satisfying a condition). Currently, with pandas, I end up updating all rows that satisfy the condition.

col1 = ildf1.index.get_level_values(1) # https://stackoverflow.com/a/50608928/9148067
cond = col1.str.contains('DN_Mega') & (ildf1['Avl'] == 1)

# iloc rejects a boolean Series as a mask, so pass the underlying NumPy array
ildf1.iloc[cond.to_numpy(), [0, 2]] = ['Tomato', 0]

How do I restrict the update to, say, only 10 of the rows that satisfy the condition?

PS: I used get_level_values because my df has a four-level MultiIndex (GR, PP+MT, Bay, Row).


Solution

  • For a df defined as below, add an extra index level that numbers every row, so each row has a unique label. You can then slice the matching index labels to the number of rows you need and assign the new values only to those. Here you go =^..^=

    import pandas as pd
    
    
    df = pd.DataFrame({'Crop': ['', '', '', '', ''], 'IPR': ['', '', '', '', ''], 'Avi': [1, 2, 3, 4, 5]}, index=[['0','0', '8', '8', '8'], ['6', '7', '7', '7', '7']])
    
    # add a running row number as an extra index level so every row has a unique label
    df['id'] = range(len(df))
    df.set_index('id', append=True, inplace=True)
    
    # keep only the first two index labels that satisfy the condition
    condition = df.index[df.index.get_level_values(0).str.contains('8')][0:2]
    df.loc[condition, ['Crop', 'IPR']] = ['Tomato', 0]
    

    Output:

              Crop IPR  Avi
        id                 
    0 6 0                 1
      7 1                 2
    8 7 2   Tomato   0    3
        3   Tomato   0    4
        4                 5
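
An alternative that avoids adding an index level is to work with row positions instead of labels. The following is only a sketch, not part of the original answer: it assumes the column is named Avl as in the question, uses made-up sample data, and introduces a hypothetical limit n. It turns the boolean condition into integer positions with np.flatnonzero, keeps the first n of them, and assigns through iloc, which also works when index labels repeat.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the index labels repeat on purpose.
df = pd.DataFrame(
    {"Crop": ["", "", "", "", ""], "IPR": ["", "", "", "", ""], "Avl": [1, 0, 1, 1, 1]},
    index=[["0", "0", "8", "8", "8"], ["6", "7", "7", "7", "7"]],
)

n = 2  # hypothetical limit: update at most n matching rows

# Boolean mask: index-level condition combined with a column condition.
level0 = df.index.get_level_values(0)
mask = np.asarray(level0.str.contains("8"), dtype=bool) & (df["Avl"] == 1).to_numpy()

# Integer positions of the matching rows, truncated to the first n.
positions = np.flatnonzero(mask)[:n]

# Assign one column at a time so each right-hand side is an unambiguous scalar.
df.iloc[positions, df.columns.get_loc("Crop")] = "Tomato"
df.iloc[positions, df.columns.get_loc("Avl")] = 0

print(df)
```

Three rows match the condition here, but only the first two are updated; the third keeps its original values.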