I have a data frame with multiple columns and multiple rows. In one of these columns there are dates that take the form of mm/dd/yyyy
.
I am trying to convert this using df['col'] = pd.to_datetime(df['col'])
but am getting the following error because there are multiple records that have 00
in the place of a missing month or day:
ValueError: day is out of range for month
I don't want to do df['col'] = pd.to_datetime(df['col'], errors = 'coerce')
because I want to keep whatever data is there.
I would like all the dates that are missing days or months or both (e.g 11/00/2018, 00/13/2018, or 00/00/2018)
to have the value 01
where the value is missing (e.g 11/01/2018, 01/13/2018, 01/01/2018
).
You could use the following regex to replace 00
:
import pandas as pd
data = ['11/00/2018', '00/13/2018', '00/00/2018']
df = pd.DataFrame(data=data, columns=['col'])
replace = df['col'].replace('00/', '01/', regex=True)
result = pd.to_datetime(replace)
print(result)
Output
0 2018-11-01
1 2018-01-13
2 2018-01-01
Name: col, dtype: datetime64[ns]