Search code examples
pythonpandasdatetimeconditional-formatting

Add days to date conditionally using Pandas


I have a table that I need add days to and create a new column with that information. The problem I am having is that there are two date calculations based on a different column. Here is a similar table to the one I am working with:

Type    Name    Date
A       Abe     6/2/2021
B       Joe     6/15/2021
A       Jin     6/25/2021
A       Jen     6/1/2021
B       Pan     6/21/2021
B       Pin     6/22/2021
B       Hon     6/11/2021
A       Hen     6/23/2021
A       Bin     6/23/2021
A       Ban     6/5/2021

I am trying to get the table to return like this where Type A goes up by 7 days and Type B goes up by 2 business days:

Type    Name    Date        NewDate
A       Abe     6/2/2021    6/9/2021
B       Joe     6/15/2021   6/19/2021
A       Jin     6/25/2021   7/2/2021
A       Jen     6/1/2021    6/8/2021
B       Pan     6/21/2021   6/23/2021
B       Pin     6/22/2021   6/26/2021
B       Hon     6/11/2021   6/13/2021
A       Hen     6/23/2021   6/30/2021
A       Bin     6/23/2021   6/30/2021
A       Ban     6/5/2021    6/12/2021

So far I have tried these:

import pandas as pd
from pandas.tseries.offsets import BDay
from datetime import datetime, timedelta
df1['NewDate'] = df1.apply(df1['Date'] + timedelta(days=7) 
                 if x=='Emergency' else df1['Date'] + BDay(2) for x in df1['Type'])

Don't run that, either you will go in an infinite loop or it will take a very long time.

I've also run this:

df1['NewDate'] = [df1['Date'] + timedelta(days=7) if i=='Emergency' else df1['Date'] + BDay(2)
                 for i in df1.Type] (also tried with df1[Type] same results.

This puts all the rows in a single row (almost looks like how it returns on jupyter notebook with the ...)

I have also tried this:

df1['NewDate'] = df1['Type'].apply(lambda x: df1['Date'] + timedelta(days=7) if x=='Emergency'
                                   else df1['Date'] + BDay(2))

When I run that one it will go through each row on the type and apply the correct logic on the if emergency calculate by 7 days and if not calculate by business day, the problem is that every row returned is calculated on the first row of the entire table.

At this point I am a little lost, any help would be greatly appreciated. For simplicity sakes it can be calculated at plus timedelta(7) and plus timedelta(2). Also what would change if I had to add more conditions like say on Name column.


Solution

  • To use apply, try:

    df["Date"] = pd.to_datetime(df["Date"])
    df["NewDate"] = df.apply(lambda x: x["Date"]+BDay(2) if x["Type"]=="B" else x["Date"]+pd.DateOffset(days=7), axis=1)
    
    >>> df
      Type Name       Date    NewDate
    0    A  Abe 2021-06-02 2021-06-09
    1    B  Joe 2021-06-15 2021-06-17
    2    A  Jin 2021-06-25 2021-07-02
    3    A  Jen 2021-06-01 2021-06-08
    4    B  Pan 2021-06-21 2021-06-23
    5    B  Pin 2021-06-22 2021-06-24
    6    B  Hon 2021-06-11 2021-06-15
    7    A  Hen 2021-06-23 2021-06-30
    8    A  Bin 2021-06-23 2021-06-30
    9    A  Ban 2021-06-05 2021-06-12
    

    Alternatively, you can use numpy.where:

    import numpy as np
    df["NewDate"] = np.where(df["Type"]=="B", df["Date"]+BDay(2), df["Date"]+pd.DateOffset(7))