Search code examples
pythonarrayspandasdataframeapply

Replace value based on a corresponding value but keep value if criteria not met


Given the following dataframe,

INPUT df:

Cost_centre Pool_costs
90272 A
92705 A
98754 A
91350 A

Replace Pool_costs value with 'B' given the Cost_centre value but keep the Pool_costs value if the Cost_centre value does not appear in list.

OUTPUT df:

Cost_centre Pool_costs
90272 B
92705 A
98754 A
91350 B

Current Code:

This code works up until the else side of lambda; finding the Pool_costs value again is the hard part.

df = pd.DataFrame({'Cost_centre': [90272, 92705, 98754, 91350],
                   'Pool_costs': ['A', 'A', 'A', 'A']})

pool_cc = ([90272,91350])

pool_cc_set = set(pool_cc)

df['Pool_costs'] = df['Cost_centre'].apply(lambda x: 'B' if x in pool_cc_set else df['Pool_costs'])
print (df)

I have used the following and have found success but it gets hard to read and modify when there are a lot of cost_centre's to change.

df = pd.DataFrame({'Cost_centre': [90272, 92705, 98754, 91350],
                   'Pool_costs': ['A', 'A', 'A', 'A']})

filt = df['Cost_centre'] == '90272'|df['Cost_centre'] == '91350')
df.loc[filt, 'Pool_costs'] = 'B'

Solution

  • IIUC, you can use isin

    filt = df['Cost_centre'].isin([90272, 91350])
    df.loc[filt, 'Pool_costs'] = 'B'
    
    print(df)
    
       Cost_centre Pool_costs
    0        90272          B
    1        92705          A
    2        98754          A
    3        91350          B