Search code examples
pythonpandasdatetimeif-statementconditional-statements

Add quantity to pandas column if row condition is met


I have this list of months [11, 2, 6, 10] and a dataframe that looks like this:

PredictionTargetDateEOM business_days
0       2022-06-30      22
1       2022-06-30      22
2       2022-06-30      22
3       2022-06-30      22
4       2022-06-30      22
        ... ... ...
172422  2022-11-30      21
172423  2022-11-30      21
172424  2022-11-30      21
172425  2022-11-30      21
172426  2022-11-30      21

I need to check each row to see if the month in PredictionTargetDateEOM is in the list, and if so, add 0.5 to the business_days column for that row. The output would look like this:

PredictionTargetDateEOM business_days
0       2022-06-30      22.5
1       2022-06-30      22.5
2       2022-06-30      22.5
3       2022-06-30      22.5
4       2022-06-30      22.5
        ... ... ...
172422  2022-11-30      21.5
172423  2022-11-30      21.5
172424  2022-11-30      21.5
172425  2022-11-30      21.5
172426  2022-11-30      21.5

Of course for rows in the dataframe with months not in the list, the number of business days for those rows should remain the same. I have tried these three methods to no avail:

for row in predicted_df['PredictionTargetDateEOM']:
    if row.month in months:
        predicted_df['business_days'] += 0.5

^This one just sums up all of the values in business_days. I'm not sure how to select just the right row in the if statement. Would I use pandas ix attribute?

predicted_df['business_days'] = df.apply(lambda x: x['business_days'] + 0.5 if x['PredictionTargetDateEOM'].month in months else x['business_days'], axis=1)

This one just gives me a blank dataframe.

predicted_df.loc[predicted_df['PredictionTargetDateEOM'].month in months, 'final_business_days'] = #predicted_df['business_days']+0.5

The commented out section is because I don't know how to properly make this logic work, or if it is the right approach. I know this is how to set a new value, I don't know if I can use this logic to update an existing value.


Solution

  • as long as your dates are datetime objects, you can use .dt.month.isin()

    import pandas as pd
    df = pd.DataFrame({'PredictionTargetDateEOM': ['2022-06-30', '2022-06-30'],
     'business_days': [22, 22]})
    
    df['PredictionTargetDateEOM'] = pd.to_datetime(df['PredictionTargetDateEOM'])
    
    df.loc[df['PredictionTargetDateEOM'].dt.month.isin([11, 2, 6, 10]), 'business_days'] +=.5
    
    print(df)
    

    Output

      PredictionTargetDateEOM  business_days
    0              2022-06-30           22.5
    1              2022-06-30           22.5