I have a dataset where the dates are all in yyyy-mm-ddT00:00:00.000Z format. I need to calculate the difference between the two date fields.
I have tried using dt.days (below) to perform the difference equation but I got a type Error.
TypeError: unsupported operand type(s) for -: 'str' and 'str'
I have also tried to use dt.datetime.strptime(df['columnname'], "%y%m%d").date()
. Any other ideas of what I can do?
# import packages
# python
import pandas as pd
import numpy as np
import datetime as dt
from datetime import timedelta
from datetime import datetime
import os
import tempfile
import pymssql
from dateutil.relativedelta import relativedelta
import tempfile
# initialize data of lists.
data = {'ID': ['089', '983', '037', '654'],
'Schedule_Date': ['2024-02-06T00:00:00.000Z', '2024-03-17T00:00:00.000Z', '2024-02-02T00:00:00.000Z', '2024-08-14T00:00:00.000Z'],
'Out_Date': ['2024-02-08T00:00:00.000Z', '2024-04-27T00:00:00.000Z', '2024-05-24T00:00:00.000Z', '2024-02-26T00:00:00.000Z']}
# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['rank1',
'rank2',
'rank3',
'rank4'])
df['days_alert']= (df['Out_Date'] - df['Schedule_Date']).dt.days
The to_datetime
function is your friend here. Your date columns are just strings. To do date arithmetic, they have to be converted to date/time types.
import pandas as pd
# initialize data of lists.
data = {'ID': ['089', '983', '037', '654'],
'Schedule_Date': ['2024-02-06T00:00:00.000Z', '2024-03-17T00:00:00.000Z', '2024-02-02T00:00:00.000Z', '2024-08-14T00:00:00.000Z'],
'Out_Date': ['2024-02-08T00:00:00.000Z', '2024-04-27T00:00:00.000Z', '2024-05-24T00:00:00.000Z', '2024-02-26T00:00:00.000Z']}
# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['rank1',
'rank2',
'rank3',
'rank4'])
df['Schedule_Date'] = pd.to_datetime(df['Schedule_Date'])
df['Out_Date'] = pd.to_datetime(df['Out_Date'])
df['days_alert']= (df['Out_Date'] - df['Schedule_Date']).dt.days
print(df)
Output:
ID Schedule_Date Out_Date days_alert
rank1 089 2024-02-06 00:00:00+00:00 2024-02-08 00:00:00+00:00 2
rank2 983 2024-03-17 00:00:00+00:00 2024-04-27 00:00:00+00:00 41
rank3 037 2024-02-02 00:00:00+00:00 2024-05-24 00:00:00+00:00 112
rank4 654 2024-08-14 00:00:00+00:00 2024-02-26 00:00:00+00:00 -170