I am working on old text files with 2-digit years where the default century logic in dateutil.parser
doesn't seem to work well. For example, the attack on Pearl Harbor was not on dparser.parse("12/7/41")
(which returns 2041-12-07).
The built-in century "threshold" for rolling back into the 1900s seems to be at 66:
import dateutil.parser as dparser
print(dparser.parse("12/31/65")) # goes forward to 2065-12-31 00:00:00
print(dparser.parse("1/1/66")) # goes back to 1966-01-01 00:00:00
For my purposes I would like to set this "threshold" at 17, so that:
"12/31/16" parses to 2016-12-31 (yyyy-mm-dd)
"1/1/17" parses to 1917-01-01
But I would like to continue to use this module, as its fuzzy matching seems to be working well.
The documentation doesn't identify a parameter for doing this... is there an argument I'm overlooking?
This isn't particularly well documented, but you can override this behavior through dateutil.parser.parse. Its second argument is a parserinfo object, and the method you'll be concerned with is convertyear. The default implementation is what's causing you problems: it bases its interpretation of the century on the current year, plus or minus fifty years. That's why you're seeing the transition at 1966. Next year it will be 1967. :)
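To make the sliding window concrete, here is a rough standalone sketch of that "current year plus or minus fifty" rule. It is a paraphrase of the behavior described above, not dateutil's actual source; the function name sliding_window_year and the today_year parameter are made up for illustration.

```python
import datetime


def sliding_window_year(two_digit_year, today_year=None):
    """Map a two-digit year into the window [today_year - 50, today_year + 50).

    Hypothetical helper that mimics the default behavior described above;
    it is not part of dateutil.
    """
    if today_year is None:
        today_year = datetime.date.today().year
    century = today_year // 100 * 100
    year = two_digit_year + century
    if year >= today_year + 50:   # too far in the future: go back a century
        year -= 100
    elif year < today_year - 50:  # too far in the past: go forward a century
        year += 100
    return year


print(sliding_window_year(65, today_year=2016))  # 2065
print(sliding_window_year(66, today_year=2016))  # 1966
print(sliding_window_year(66, today_year=2017))  # 2066
print(sliding_window_year(67, today_year=2017))  # 1967
```

With 2016 as the current year the cutoff falls at 66, and it moves forward by one each year, which matches what you observed.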
Since you are using this personally and may have very specific needs, you don't have to be super-generic. You could do something as simple as this if it works for you:
from dateutil.parser import parse, parserinfo

class MyParserInfo(parserinfo):
    def convertyear(self, year, *args, **kwargs):
        if year < 100:
            year += 1900
        return year

parse('1/21/47', MyParserInfo())
# datetime.datetime(1947, 1, 21, 0, 0)
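If you want the fixed threshold of 17 from the question rather than pushing every two-digit year into the 1900s, the same hook can apply a pivot. This is a minimal sketch, assuming convertyear receives the year plus an optional century_specified flag (as it does in recent dateutil releases, as far as I know). The class name PivotParserInfo and the PIVOT attribute are made up for illustration.

```python
from dateutil.parser import parse, parserinfo


class PivotParserInfo(parserinfo):
    """Hypothetical parserinfo subclass with a fixed two-digit-year pivot."""

    # Assumption taken from the question: 00-16 -> 2000s, 17-99 -> 1900s.
    PIVOT = 17

    def convertyear(self, year, century_specified=False):
        # Only touch years that were written without a century.
        if year < 100 and not century_specified:
            year += 2000 if year < self.PIVOT else 1900
        return year


info = PivotParserInfo()
print(parse("12/31/16", info))  # 2016-12-31 00:00:00
print(parse("1/1/17", info))    # 1917-01-01 00:00:00
print(parse("12/7/41", info))   # 1941-12-07 00:00:00
```

Unlike the default, this cutoff does not move as the calendar advances, so the same file parses the same way next year.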