
Exclude relative dates when parsing text with the dateparser.search module


I'm trying to get search_dates from dateparser.search to ignore relative dates like "tomorrow", "next week", "more", etc.

Currently, here's the behavior I'm getting:

from dateparser.search import search_dates

In: search_dates("more", settings={"STRICT_PARSING": True})

Out: [("more", datetime.datetime(2021, 4, 27, 11, 21, 45, 998830))]

In: search_dates("March 15, 2020", settings={"STRICT_PARSING": True})

Out: [("March 15, 2020", datetime.datetime(2020, 3, 15, 0, 0))]

I'm expecting:

from dateparser.search import search_dates

In: search_dates("more", settings={"STRICT_PARSING": True})

Out: [("more", None)]

In: search_dates("March 15, 2020", settings={"STRICT_PARSING": True})

Out: [("March 15, 2020", datetime.datetime(2020, 3, 15, 0, 0))]

Solution

  • To do this, exclude the relative-time parser from the list of parsers that search_dates uses.

    from dateparser_data.settings import default_parsers
    from dateparser.search import search_dates
    
    # you start by creating a list of all parsers minus the relative-time parser
    parsers = [parser for parser in default_parsers if parser != 'relative-time']
    
    # then you pass the list you just created to the settings
    search_dates('today', settings={'PARSERS': parsers})
    

    Using your examples:

    In: search_dates("more", settings={"STRICT_PARSING": True, 'PARSERS': parsers})
    
    Out: None
    
    In: search_dates("March 15, 2020", settings={"STRICT_PARSING": True, 'PARSERS': parsers})
    
    Out: [('March 15, 2020', datetime.datetime(2020, 3, 15, 0, 0))]
    

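If you need this in more than one place, the filtering step can be wrapped in a small helper. The sketch below is only an illustration: parsers_without and search_absolute_dates are hypothetical names, not part of dateparser, and the fallback list of parser names is an assumption based on the names in dateparser's documentation, so check it against your installed version.

```python
# Prefer the list shipped with dateparser. The fallback below is an
# assumption (names taken from dateparser's docs) and may not match
# every installed version.
try:
    from dateparser_data.settings import default_parsers
except ImportError:
    default_parsers = [
        "timestamp",
        "negative-timestamp",
        "relative-time",
        "custom-formats",
        "absolute-time",
        "no-spaces-time",
    ]


def parsers_without(*excluded):
    """Hypothetical helper: default parser names minus the excluded ones, order kept."""
    return [name for name in default_parsers if name not in excluded]


def search_absolute_dates(text, **extra_settings):
    """Hypothetical wrapper: call search_dates with the relative-time parser removed."""
    # Imported here so the helper above stays usable without dateparser installed.
    from dateparser.search import search_dates

    settings = {"PARSERS": parsers_without("relative-time"), **extra_settings}
    return search_dates(text, settings=settings)


print(parsers_without("relative-time"))
```

A call such as search_absolute_dates("March 15, 2020", STRICT_PARSING=True) then passes the same settings as the examples above without rebuilding the parser list each time.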