python, python-datetime

Unexpected behaviour when datetime.strptime parses a format string with no separators


I am trying to parse and validate a date and hour that must be in "yyyymmddhh" format. I want the function to raise an exception if the string does not conform to that format, so I tested two ill-formed strings that lack the hour part:

Test 1. Results as expected

>>> datetime.strptime("20230609", "%Y%m%d%H")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/miniconda3/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/home/user/miniconda3/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '20230609' does not match format '%Y%m%d%H'

Test 2. Bug?

Changing only the date, from June 9th to June 10th:

>>> datetime.strptime("20230610", "%Y%m%d%H")
datetime.datetime(2023, 6, 1, 0, 0)

As I understand it, %Y, %m, %d and %H expect zero-padded, fixed-length numbers totalling 10 characters, so the lack of separators shouldn't fool the parser. Am I mistaken?

Tested on Python 3.7 and 3.10.


Solution

  • Note 9 in the documentation indicates the leading 0 is optional with strptime:

    When used with the strptime() method, the leading zero is optional for formats %d, %m, %H, %I, %M, %S, %j, %U, %W, and %V. Format %y does require a leading zero.

    So with 20230610, strptime can match the whole string by consuming 2023 with %Y, 06 with %m, and only 1 (not 10) with %d, which leaves 0 to match %H. The result is June 1st at hour 0.

    With 20230609, 0 is not a valid day or month, so no interpretation lets %Y%m%d consume fewer than 8 characters. Nothing is left for %H, and the parse fails.
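If the goal is strict validation, one possible workaround is to check the fixed width before calling strptime. The following is only a sketch: the helper name parse_yyyymmddhh and the error message are made up for illustration, and it assumes the input should be exactly 10 ASCII digits.

```python
import re
from datetime import datetime

# Exactly 10 ASCII digits: 4 for the year, 2 each for month, day and hour.
_TEN_DIGITS = re.compile(r"[0-9]{10}")


def parse_yyyymmddhh(text):
    """Hypothetical helper: parse 'yyyymmddhh', rejecting shorter or longer input."""
    # Guard against strptime's optional leading zeros by enforcing the width first.
    if not _TEN_DIGITS.fullmatch(text):
        raise ValueError(f"expected exactly 10 digits (yyyymmddhh), got {text!r}")
    # strptime still validates the ranges (month 01-12, hour 00-23, real dates).
    return datetime.strptime(text, "%Y%m%d%H")
```

With this guard, "20230610" raises ValueError instead of being read as June 1st, while "2023061000" parses as datetime(2023, 6, 10, 0, 0). Once the input is known to be exactly 10 digits, strptime can only consume the whole string when every field takes its full width, so the short-field interpretation is no longer possible.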