I am trying to convert a series of date strings to dates. However, even with the 'exact' parameter set to True, pd.to_datetime fills in a default day of 01 when the string has no day value, instead of rejecting it.
PS: I am using pandas 0.24.2
import numpy as np
import pandas as pd

data = np.array(['2014-10', '2015-06/07', '20-14-08', '2a115-09'])
ser = pd.Series(data)
pd.to_datetime(ser, errors='coerce', format='%Y-%m-%d', exact=True)
#Result:
0 2014-10-01
1 NaT
2 NaT
3 NaT
dtype: datetime64[ns]
#Expected
0 NaT
1 NaT
2 NaT
3 NaT
dtype: datetime64[ns]
You can check whether each string matches the full date format with str.match, then use np.where to keep the parsed value only where it matches and NaT everywhere else. Notice that I added one more date to showcase the correct result:
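data = np.array(['2014-10', '2015-06/07', '20-14-08', '2a115-09', '2018-09-20'])
# Raw string avoids invalid-escape warnings; str.match anchors at the start, '$' anchors the end
m = pd.Series(data).str.match(r'\d{4}-\d{2}-\d{2}$')
# Keep the parsed date where the pattern matched, NaT otherwise
data_new = pd.Series(np.where(m,
                              pd.to_datetime(data, format='%Y-%m-%d', errors='coerce'),
                              np.datetime64('NaT')))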
0 NaT
1 NaT
2 NaT
3 NaT
4 2018-09-20
dtype: datetime64[ns]
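As a possible variation on the same idea, here is a minimal sketch that masks the non-matching strings with Series.where before parsing, so np.where is not needed. The names mask and result are only illustrative, and this assumes that every valid input looks exactly like YYYY-MM-DD:

```python
import numpy as np
import pandas as pd

data = np.array(['2014-10', '2015-06/07', '20-14-08', '2a115-09', '2018-09-20'])
ser = pd.Series(data)

# True only for strings that look exactly like YYYY-MM-DD
# (str.match anchors at the start, '$' anchors the end).
mask = ser.str.match(r'\d{4}-\d{2}-\d{2}$')

# Series.where replaces non-matching entries with NaN,
# which to_datetime then turns into NaT.
result = pd.to_datetime(ser.where(mask), format='%Y-%m-%d', errors='coerce')
print(result)
```

Only the last entry should come out as a date here; the rest should be NaT, the same as in the np.where version above.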