python, pandas, python-datetime

Pandas time diff: Timestamp subtraction must have the same timezones or no timezones


When I try to calculate the difference between a series of dates and today, I get the error "Timestamp subtraction must have the same timezones or no timezones".

Loading data

import pandas as pd
raw_data = pd.read_json('resultados_finales_completos10000.json')
print(raw_data['fecha_publicacion'][0])

2017-09-24T15:04:22.000Z

Convert the object-type column to datetime

raw_data['fecha_publicacion'] =  pd.to_datetime(raw_data['fecha_publicacion'])
print(raw_data['fecha_publicacion'][0])
print(raw_data['fecha_publicacion'][0].tzinfo, type(today.tzinfo))

2017-09-24 15:04:22+00:00

UTC <class 'datetime.timezone'>

Then I set today's value

from datetime import datetime, timezone
today = datetime.now(tz=timezone.utc)
print(today)
print(today.tzinfo, type(today.tzinfo))

2021-08-13 21:31:16.031605+00:00

UTC <class 'datetime.timezone'>

In both cases I have the same timezone settings.

Finally, I try to create a new column to store the time difference as follows, and I get the aforementioned error.

raw_data['meses_venta'] = today - raw_data['fecha_publicacion']

I tried the suggestions from several related posts without much success. Any clues welcome. Thanks in advance!


Solution

  • As the error says, both operands must have the same timezone (or both must have no timezone).

    One way to do so:

    from datetime import datetime, timezone
    import pandas as pd
    x = pd.to_datetime('2017-09-24T15:04:22.000Z')
    today = datetime.now(tz=x.tz)
    
    today - x # Timedelta('1419 days 06:59:48.134906')
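
The same idea can be applied to a whole column rather than a single Timestamp. The following is only a minimal sketch under some assumptions: `fechas` is a hypothetical stand-in for raw_data['fecha_publicacion'], and the two sample dates are made up for illustration. It shows two options that should avoid the mismatch: reusing the column's own timezone object for "today", or dropping the timezone on both sides.

```python
import pandas as pd

# Hypothetical stand-in for raw_data['fecha_publicacion'] (sample values are made up)
fechas = pd.to_datetime(
    pd.Series(["2017-09-24T15:04:22.000Z", "2020-01-01T00:00:00.000Z"])
)

# Option 1: build "today" with the very same tz object the column carries
today = pd.Timestamp.now(tz=fechas.dt.tz)
diff_aware = today - fechas  # Series of Timedelta values

# Option 2: drop the timezone on both sides (both values stay in UTC wall time)
today_naive = pd.Timestamp.now(tz="UTC").tz_localize(None)
diff_naive = today_naive - fechas.dt.tz_localize(None)

print(diff_aware)
print(diff_naive)
```

Option 1 keeps the values timezone-aware, which is usually the safer choice; Option 2 is only reasonable when every value is already known to be in UTC.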