When I try to calculate the difference between a date series and today, I get the error "Timestamp subtraction must have the same timezones or no timezones".
Loading data
raw_data = pd.read_json('resultados_finales_completos10000.json')
print(raw_data['fecha_publicacion'][0])
2017-09-24T15:04:22.000Z
Convert the object-type column to datetime
raw_data['fecha_publicacion'] = pd.to_datetime(raw_data['fecha_publicacion'])
print(raw_data['fecha_publicacion'][0])
print(raw_data['fecha_publicacion'][0].tzinfo, type(today.tzinfo))
2017-09-24 15:04:22+00:00
UTC <class 'datetime.timezone'>
Then I set today's value
today = datetime.now(tz=timezone.utc)
print(today)
print(today.tzinfo, type(today.tzinfo))
2021-08-13 21:31:16.031605+00:00
UTC <class 'datetime.timezone'>
In both cases I have the same timezone settings.
Finally, I try to create a new column to store the time difference as follows, and I get the aforementioned error.
raw_data['meses_venta'] = today - raw_data['fecha_publicacion']
I tried the following posts without much success. Any clues welcome. Thanks in advance!
As the error says, both sides of the subtraction must have the same timezone (or no timezone at all).
One way to do so:
from datetime import datetime, timezone
import pandas as pd
x = pd.to_datetime('2017-09-24T15:04:22.000Z')
today = datetime.now(tz=x.tz)
today - x # Timedelta('1419 days 06:59:48.134906')
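The same idea can be applied to a whole column rather than a single value. The following is only a minimal sketch: the two-row DataFrame is made-up sample data standing in for the JSON file from the question, and the `meses_venta_naive` column name is hypothetical. It shows two ways to keep both sides consistent: take "now" as a pandas Timestamp in UTC, or strip the timezone from both sides.

```python
import pandas as pd

# Made-up sample data standing in for the JSON file in the question.
raw_data = pd.DataFrame({
    "fecha_publicacion": [
        "2017-09-24T15:04:22.000Z",
        "2020-01-01T00:00:00.000Z",
    ]
})

# Parse the strings as timezone-aware UTC datetimes.
raw_data["fecha_publicacion"] = pd.to_datetime(
    raw_data["fecha_publicacion"], utc=True
)

# Option 1: build "today" as a pandas Timestamp in UTC, so both
# operands are timezone-aware and in the same zone.
today = pd.Timestamp.now(tz="UTC")
raw_data["meses_venta"] = today - raw_data["fecha_publicacion"]

# Option 2: drop the timezone on both sides ("no timezones").
# tz_localize(None) keeps the UTC wall-clock time and removes the tz.
today_naive = today.tz_localize(None)
fechas_naive = raw_data["fecha_publicacion"].dt.tz_localize(None)
raw_data["meses_venta_naive"] = today_naive - fechas_naive

print(raw_data.dtypes)
```

Both new columns hold timedelta values. Despite its name in the question, `meses_venta` then contains an elapsed duration, not a month count; converting it to months would be a separate step.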