python-3.x, string, date, nlp, datefinder

datefinder won't find dates when the string has ':' before the date


The datefinder module doesn't find a date when there is a ':' right before it.

There is a similar question here: Datefinder Module Stranger behavior on particular string

string = "Assessment Date 17-May-2017 at 13:31"

list(datefinder.find_dates(string.lower()))
#Returns [datetime.datetime(2017, 5, 17, 13, 31)]

However, when I add ':' after the label, as in "Assessment Date:", it fails:

string = "Assessment Date: 17-May-2017 at 13:31"
list(datefinder.find_dates(string.lower()))
#returns []

Solution

  • This is the delimiter pattern in datefinder: DELIMITERS_PATTERN = r"[/:-\,\s_+@]+"

    The colon is one of those delimiters, so 'Date:' causes a problem when datefinder tries to parse the string.

    You could pre-clean the string with a regular expression before passing it to datefinder.

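    import re as regex
    import datefinder
    
    def preclean_input_text(text):
      # Replace "<letter>: " with a single space so a label such as "Date: "
      # no longer ends in a colon. The module is imported as `regex`, so the
      # flag must be regex.IGNORECASE (re.IGNORECASE would raise a NameError).
      cleaned_text = regex.sub(r'[a-z]:\s', ' ', text, flags=regex.IGNORECASE)
      return cleaned_text
    
    def parse_date_information(text):
      # find_dates returns a generator, so collect the results into a list.
      date_info = list(datefinder.find_dates(text.lower()))
      return date_info
    
    string = "Assessment Date: 17-May-2017 at 13:31"
    cleaned_string = preclean_input_text(string)
    print(parse_date_information(cleaned_string))
    # output
    # [datetime.datetime(2017, 5, 17, 13, 31)]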
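
One thing to note: the pattern `[a-z]:\s` also matches the letter in front of the colon, so "Date: " is turned into "Dat ". If you would rather keep the label intact, a lookbehind/lookahead variant might work. The following is only a minimal sketch using the standard `re` module. The names `LABEL_COLON` and `strip_label_colons` are made up for illustration and are not part of datefinder.

```python
import re

# Hypothetical helper, not part of datefinder.
# Match a colon only when it directly follows a letter and is directly
# followed by whitespace. The lookbehind and lookahead consume nothing,
# so only the colon itself is removed.
LABEL_COLON = re.compile(r"(?<=[A-Za-z]):(?=\s)")

def strip_label_colons(text):
    """Remove label colons ("Date: ") but keep time colons ("13:31")."""
    return LABEL_COLON.sub("", text)

samples = [
    "Assessment Date: 17-May-2017 at 13:31",
    "Due: 2017-05-17 09:00",
    "No label here 13:31",
]
for sample in samples:
    print(repr(strip_label_colons(sample)))
# 'Assessment Date 17-May-2017 at 13:31'
# 'Due 2017-05-17 09:00'
# 'No label here 13:31'
```

The first cleaned string is identical to the colon-free string from the question, so it can be passed to `datefinder.find_dates` in the same way. Colons between digits, such as the one in the time, are left alone because they do not follow a letter.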