I am getting my data and some dates from an unconventional source, so there are minor differences in the date strings. The main differences are that the day is not always zero-padded, there can be whitespace after the day (as in 2/9 /2018), and the months are not zero-padded either. I get the error "time data does not match format '%m %d %Y'" when trying datetime.strptime. How can I convert a column of dates with subtle differences like this? Please see the code and sample data below.
d_o = datetime.datetime.strptime(df['start'][1], '%m %d %Y')
You should use a third-party library such as dateutil. This library accepts a wide variety of date formats at the cost of performance.
from dateutil import parser

# Sample data, including entries with a stray space after the day
lst = ['1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018', '2/9 /2018', '2/9 /2018',
       '1/19/2018', '1/19/2018', '1/26/2018', '1/26/2018', '2/2/2018', '2/2/2018',
       '2/9 /2018']

# Parse each string without specifying a format
res = [parser.parse(i) for i in lst]
Result:
[datetime.datetime(2018, 1, 26, 0, 0),
datetime.datetime(2018, 1, 26, 0, 0),
datetime.datetime(2018, 2, 2, 0, 0),
datetime.datetime(2018, 2, 2, 0, 0),
datetime.datetime(2018, 2, 9, 0, 0),
datetime.datetime(2018, 2, 9, 0, 0),
datetime.datetime(2018, 1, 19, 0, 0),
datetime.datetime(2018, 1, 19, 0, 0),
datetime.datetime(2018, 1, 26, 0, 0),
datetime.datetime(2018, 1, 26, 0, 0),
datetime.datetime(2018, 2, 2, 0, 0),
datetime.datetime(2018, 2, 2, 0, 0),
datetime.datetime(2018, 2, 9, 0, 0)]
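If performance matters and the only irregularities are the ones described in the question (stray whitespace and missing zero padding), a standard-library alternative may be enough. The sketch below assumes every value is month/day/year separated by slashes; the helper name parse_loose_date is hypothetical and only for illustration. It removes all whitespace and then calls datetime.strptime with '%m/%d/%Y', which also accepts months and days without a leading zero when parsing.

```python
from datetime import datetime


def parse_loose_date(text):
    """Parse a month/day/year string that may contain stray whitespace.

    Hypothetical helper; assumes slash-separated month/day/year input.
    """
    # Drop every whitespace character, so '2/9 /2018' becomes '2/9/2018'.
    cleaned = "".join(text.split())
    # When parsing, %m and %d also match values without a leading zero.
    return datetime.strptime(cleaned, "%m/%d/%Y")


samples = ['1/26/2018', '2/2/2018', '2/9 /2018']
parsed = [parse_loose_date(s) for s in samples]
```

For a DataFrame column like the one in the question, something along the lines of df['start'].apply(parse_loose_date) should work, assuming the column holds only strings in this shape.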