Search code examples
pandasrelativedelta

relativedelta - Add # of months based on column values


I would like to add "exact months" to a start date to a start date in order to calculate an end date and have been playing with relativedelta. I have the following:

df1['Start Date'] = ['11/1/2018', '3/15/2019', NaN, '5/15/2019', '2/28/2017', NaN, '10/31/2018']

df1['Months'] = [12.0, 36.0, 15.0, 36.0, 12.0, 9.0, 5.0]

df1['Start Date'] is currently datetime64[ns] while df1['Months'] is float64.

The end result should be df1['Start Date'] + .df1['Months'] - 1 day but the relativedelta aspect is important as I'd like to return "exact months." NaN can continue to return NaN.

Here was my attempted calc:

df1['End_Date'] = df1['Effective_Date'].apply(lambda x: x + relativedelta(months = df1['Months'].astype(float))) - pd.DateOffset(days = 1)

I get the following error that I'm not sure how to resolve:

cannot convert the series to <class 'int'>

I tried the following to no avail:

df1['Months'].astype('timedelta64[D]')

Really appreciate your help.


Solution

  • First convert column to datetimes, add months by DataFrame.apply per rows by axis=1 and last subtract one day:

    df1['Start Date'] = pd.to_datetime(df1['Start Date'])
    
    f = lambda x: x['Start Date'] + relativedelta(months = int(x['Months']))
    df1['End_Date'] = df1.apply(f, axis=1) - pd.DateOffset(days = 1)
    print (df1)
      Start Date  Months   End_Date
    0 2018-11-01    12.0 2019-10-31
    1 2019-03-15    36.0 2022-03-14
    2        NaT    15.0        NaT
    3 2019-05-15    36.0 2022-05-14
    4 2017-02-28    12.0 2018-02-27
    5        NaT     9.0        NaT
    6 2018-10-31     5.0 2019-03-30