I need to rewrite some SQL code in Python, and my problem is that I need to calculate differences in days:
As you can see, for cases where final_pmt_date is ‘9999-12-31’, the dates are subtracted without any problem.
But pandas has a limited range for the datetime64 type, so I get an exception:
All the answers I saw were about converting these dates into NaT (with the ‘coerce’ keyword). But I need to calculate the number of days for such datetimes as well.
Thank you in advance
A date like 9999-12-31 is out of range for pandas datetime.
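The limit comes from the default nanosecond resolution: such a timestamp is a signed 64-bit count of nanoseconds since 1970-01-01. As a small illustration (using NumPy directly, which is not part of the original answer), the representable bounds can be reconstructed like this:

```python
import numpy as np

# datetime64[ns] stores a signed 64-bit count of nanoseconds since 1970-01-01.
# The smallest int64 value is reserved for NaT, hence the "+ 1" for the lower bound.
int64 = np.iinfo(np.int64)
lower = np.datetime64(int64.min + 1, "ns")
upper = np.datetime64(int64.max, "ns")

print(lower)  # 1677-09-21T00:12:43.145224193
print(upper)  # 2262-04-11T23:47:16.854775807
```

Anything later than the year 2262, such as the 9999-12-31 sentinel, cannot be stored at this resolution.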
Using vanilla Python datetime might be an alternative here, for example:
from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "open": ["2021-12-27 00:00:00.000", "2019-03-06 00:00:00.000"],
        "close": ["9999-12-31 00:00:00.000", "2022-04-06 00:00:00.000"],
    }
)

# Parse both strings with the standard library and subtract row by row.
# The parentheses only group the lambda; a trailing comma here would turn it into a tuple.
df["delta"] = df.apply(
    (
        lambda row: datetime.fromisoformat(row["close"])
        - datetime.fromisoformat(row["open"])
    ),
    axis=1,
)
df
                      open                    close                  delta
0  2021-12-27 00:00:00.000  9999-12-31 00:00:00.000  2913908 days, 0:00:00
1  2019-03-06 00:00:00.000  2022-04-06 00:00:00.000     1127 days 00:00:00
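If a plain integer number of days per row is what is needed, one possible variation is to take `.days` from each `timedelta`. This is only a sketch on the same sample data; the column name `days` is an assumption made for illustration.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "open": ["2021-12-27 00:00:00.000", "2019-03-06 00:00:00.000"],
        "close": ["9999-12-31 00:00:00.000", "2022-04-06 00:00:00.000"],
    }
)

# timedelta.days is a plain int, so the result is an ordinary integer column.
df["days"] = df.apply(
    lambda row: (
        datetime.fromisoformat(row["close"]) - datetime.fromisoformat(row["open"])
    ).days,
    axis=1,
)
print(df["days"].tolist())  # [2913908, 1127]
```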
However, note that you'll have to use apply, which is not very efficient compared to the "vectorized" pandas datetime methods. Maybe using NaT as an "invalid-value identifier" is an option after all?
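If neither the row-wise apply nor NaT is acceptable, a different technique that might be worth trying is NumPy's `datetime64[D]` (day resolution), which covers a far wider range of years than the nanosecond type. This is a sketch, not part of the original suggestion; the helper `to_days` and the column name `days` are hypothetical, and it assumes every string starts with a `YYYY-MM-DD` date.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "open": ["2021-12-27 00:00:00.000", "2019-03-06 00:00:00.000"],
        "close": ["9999-12-31 00:00:00.000", "2022-04-06 00:00:00.000"],
    }
)


def to_days(column: pd.Series) -> np.ndarray:
    """Parse the leading YYYY-MM-DD of each string as datetime64[D] (hypothetical helper)."""
    # Only the date part is kept, since whole days are all that is needed here.
    return np.array([value[:10] for value in column], dtype="datetime64[D]")


# Subtracting two datetime64[D] arrays yields timedelta64[D]; cast to plain integers.
df["days"] = (to_days(df["close"]) - to_days(df["open"])).astype("int64")
print(df["days"].tolist())  # [2913908, 1127]
```

The subtraction itself runs on whole arrays; only the string slicing is done element by element.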