Search code examples
pythonpandasdatedatetimesubtraction

pandas: subtracting current date from the date in a pandas table


I am attempting to calculate the difference in days between todays and a pandas data consisting of historical data. Below is the intended code:

df['diff'] = pd.to_datetime( df['date']) - pd.datetime.now().date()

However, it produces the following error:

TypeError: unsupported operand type(s) for -: 'DatetimeIndex' and 'datetime.date'

The date column in the pandas table looks like this:

0       2018-12-18
1       2018-12-18
2       2018-12-18
3       2018-12-18
4       2018-12-18

How do I fix this error. Thanks in advance.


Solution

  • You have to subtract same types - datetimes with datetime (with zero times) or dates with date.

    Use Timestamp.now with Timestamp.normalize or Timestamp.floor for remove times:

    df['diff'] = pd.to_datetime( df['date']) - pd.Timestamp.now().normalize() 
    
    df['diff'] = pd.to_datetime( df['date']) - pd.Timestamp.now().floor('d')
    

    You can also use replace:

    dt = pd.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    df['diff'] = pd.to_datetime( df['date']) - dt
    

    Or convert Datetimes to dates for subtract same types:

    dt = datetime.datetime.now().date()
    df['diff'] = pd.to_datetime(df['date']).dt.date - dt
    

    Sample:

    rng = pd.date_range('2018-04-03', periods=10, freq='100D')
    df = pd.DataFrame({'date': rng}) 
    
    df['diff'] = pd.to_datetime( df['date']) - pd.Timestamp.now().normalize() 
    print (df)
            date      diff
    0 2018-04-03 -261 days
    1 2018-07-12 -161 days
    2 2018-10-20  -61 days
    3 2019-01-28   39 days
    4 2019-05-08  139 days
    5 2019-08-16  239 days
    6 2019-11-24  339 days
    7 2020-03-03  439 days
    8 2020-06-11  539 days
    9 2020-09-19  639 days