
(Re)Use DateTimeFormatter for parsing date ranges or mix DateTimeFormatter with regex


I have the following String representing a date range which I need to parse:

2018-10-20:2019-10-20

It consists of 2 ISO date strings separated by :

The string can get more complex by having repeated date ranges mixed with other text. This can be done by a Regex.

However, given that the latest Java has Date/Time support that most coders here and elsewhere are ecstatic about, is it possible to use, say, LocalDate's parser or a custom DateTimeFormatter in order to identify the bits in my String which are candidates for ISO-date and capture them?

Better yet, how can I extract the validation regex from a DateTimeFormatter (the regex which identifies an ISO-date, assuming there is one) and merge/compile it with my own regex for the rest of the String?

I just do not feel comfortable coding yet another ISO-date regex in my code when there is possibly already a regex in Java which does that and which I could just re-use.

Please note that I am not asking for a regex. I can do that.

Please also note that my example String can contain other date/time formats, e.g. with timezones and milliseconds and all the whistles.


Solution

  • Actually, DateTimeFormatter doesn't have an internal regex. Internally it uses a CompositePrinterParser, which in turn holds an array of DateTimePrinterParser instances (a nested interface of DateTimeFormatterBuilder), where each instance is responsible for parsing/formatting a specific field. These are implementation details, not public API, so there is nothing regex-like to extract.
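
    That said, a formatter can still pick dates out of a larger string without any regex, because DateTimeFormatter.parse(CharSequence, ParsePosition) doesn't require the parse to start at the beginning or finish at the end of the text. A minimal sketch, assuming only ISO_LOCAL_DATE is of interest (the class and method names are made up for illustration, and trying a parse at every offset is simple rather than fast):

    ```java
    import java.text.ParsePosition;
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeParseException;
    import java.time.temporal.TemporalAccessor;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper, not part of the JDK.
    final class IsoDateScanner {

        // Collects every substring that ISO_LOCAL_DATE can parse, scanning left to right.
        static List<LocalDate> findDates(String text) {
            DateTimeFormatter formatter = DateTimeFormatter.ISO_LOCAL_DATE;
            List<LocalDate> found = new ArrayList<>();
            int i = 0;
            while (i < text.length()) {
                ParsePosition pos = new ParsePosition(i);
                try {
                    // Parses from offset i and stops where the date ends.
                    TemporalAccessor parsed = formatter.parse(text, pos);
                    found.add(LocalDate.from(parsed));
                    i = pos.getIndex(); // continue right after the date
                } catch (DateTimeParseException e) {
                    i++; // no valid date starts here, try the next character
                }
            }
            return found;
        }

        public static void main(String[] args) {
            // The last candidate is an impossible date, so it is skipped.
            System.out.println(findDates("valid 2018-10-20:2019-10-20, then 2019-02-30"));
            // prints [2018-10-20, 2019-10-20]
        }
    }
    ```

    Unlike the java.text.Format methods, this parse overload reports failure by throwing DateTimeParseException rather than through the ParsePosition error index, hence the try/catch.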

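    IMO, regex is not the best approach here. If you know that all dates are separated by :, why not simply split the string and try to parse the parts individually? (This only works as long as the date strings themselves contain no colon, which rules out formats with a time or an offset such as 10:15:30+01:00.) Something like this: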

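    import java.time.format.DateTimeFormatter;

    // big string with dates separated by ':' (here, the range from the question)
    String dates = "2018-10-20:2019-10-20";

    // create a formatter for your patterns (ISO_LOCAL_DATE handles yyyy-MM-dd)
    DateTimeFormatter parser = DateTimeFormatter.ISO_LOCAL_DATE;
    for (String s : dates.split(":")) {
        parser.parse(s); // throws DateTimeParseException if "s" is invalid
    }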
    

    If you just want to validate the strings, calling parse as above is enough - it'll throw an exception if the string is invalid.
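
    If a yes/no answer is more convenient than an exception, the same idea can be wrapped in a small method. This is only a sketch: the class and method names are hypothetical, and it assumes date-only ISO parts separated by a single ':'.

    ```java
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeParseException;

    // Hypothetical helper for illustration only.
    final class DateRangeCheck {

        // Returns true if every ':'-separated part is a valid ISO date (yyyy-MM-dd).
        static boolean isValidRange(String range) {
            // limit -1 keeps trailing empty parts, so "2018-10-20:" is rejected
            for (String part : range.split(":", -1)) {
                try {
                    LocalDate.parse(part, DateTimeFormatter.ISO_LOCAL_DATE);
                } catch (DateTimeParseException e) {
                    return false; // wrong shape or impossible date
                }
            }
            return true;
        }

        public static void main(String[] args) {
            System.out.println(isValidRange("2018-10-20:2019-10-20")); // true
            System.out.println(isValidRange("2018-10-20:2019-02-30")); // false
        }
    }
    ```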

    To support multiple formats, you can use DateTimeFormatterBuilder::appendOptional. Example:

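    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.format.ResolverStyle;

    DateTimeFormatter parser = new DateTimeFormatterBuilder()
        // full ISO8601 with date/time and UTC offset (ex: 2011-12-03T10:15:30+01:00)
        .appendOptional(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
        // date/time without UTC offset (ex: 2011-12-03T10:15:30)
        .appendOptional(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
        // just date (ex: 2011-12-03)
        .appendOptional(DateTimeFormatter.ISO_LOCAL_DATE)
        // some custom format (day/month/year); "uuuu" instead of "yyyy" so it resolves in strict mode
        .appendOptional(DateTimeFormatter.ofPattern("dd/MM/uuuu"))
        // ... add as many as you need
        // create formatter; a built formatter defaults to ResolverStyle.SMART,
        // so ask for STRICT to reject dates such as February 30th
        .toFormatter()
        .withResolverStyle(ResolverStyle.STRICT);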
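
    To get typed values out of such a multi-format parser, parseBest can try several target types in order. A sketch under a few assumptions: the wrapper class is hypothetical, and it sets ResolverStyle.STRICT and uses "uuuu" in the custom pattern (both are choices of this sketch, made so that impossible dates are rejected rather than adjusted).

    ```java
    import java.time.LocalDate;
    import java.time.LocalDateTime;
    import java.time.OffsetDateTime;
    import java.time.format.DateTimeFormatter;
    import java.time.format.DateTimeFormatterBuilder;
    import java.time.format.ResolverStyle;
    import java.time.temporal.TemporalAccessor;

    // Hypothetical wrapper class for illustration only.
    final class MultiFormatParser {

        static final DateTimeFormatter PARSER = new DateTimeFormatterBuilder()
            .appendOptional(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
            .appendOptional(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
            .appendOptional(DateTimeFormatter.ISO_LOCAL_DATE)
            .appendOptional(DateTimeFormatter.ofPattern("dd/MM/uuuu"))
            .toFormatter()
            .withResolverStyle(ResolverStyle.STRICT);

        // Returns the most specific type the text supports, trying the queries in order.
        static TemporalAccessor parse(String text) {
            return PARSER.parseBest(text, OffsetDateTime::from, LocalDateTime::from, LocalDate::from);
        }

        public static void main(String[] args) {
            String[] inputs = {"2011-12-03T10:15:30+01:00", "2011-12-03T10:15:30", "2011-12-03", "03/12/2011"};
            for (String s : inputs) {
                TemporalAccessor result = parse(s);
                System.out.println(result.getClass().getSimpleName() + " -> " + result);
            }
        }
    }
    ```

    The order of the queries matters: the richest type goes first, since a string with an offset could also be converted to a plain LocalDate.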
    

    A regex to support multiple formats (as you said, "other date/time formats, e.g. with timezones and milliseconds and all the whistles") is possible, but a regex is a poor tool for validating dates: it won't catch things like day zero, day 31 in a 30-day month, February 29th in non-leap years, minutes > 59 etc.

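    A DateTimeFormatter will validate all these tricky details, while a regex only guarantees that you have digits and separators in the correct positions; it won't validate the values. One caveat: how strict the formatter is depends on its ResolverStyle. The predefined ISO_* formatters are strict, but a formatter created with toFormatter() or ofPattern defaults to ResolverStyle.SMART, which adjusts a day-of-month past the end of the month (e.g. February 30th) to the last valid day instead of rejecting it; call withResolverStyle(ResolverStyle.STRICT) if you want such input rejected. So regardless of the regex, you'll have to parse the dates anyway (which, IMHO, makes the use of regex pretty useless in this case).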
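
    To make the difference concrete, here is a small comparison. The pattern below is just an illustrative shape check written for this sketch (it is not taken from the JDK), and the class name is hypothetical:

    ```java
    import java.time.LocalDate;
    import java.time.format.DateTimeParseException;
    import java.util.regex.Pattern;

    // Hypothetical demo class for illustration only.
    final class RegexVersusParser {

        // Checks only that digits and dashes are in the right places.
        static final Pattern ISO_SHAPE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

        static boolean matchesShape(String s) {
            return ISO_SHAPE.matcher(s).matches();
        }

        // LocalDate.parse uses the strict ISO_LOCAL_DATE formatter.
        static boolean isRealDate(String s) {
            try {
                LocalDate.parse(s);
                return true;
            } catch (DateTimeParseException e) {
                return false;
            }
        }

        public static void main(String[] args) {
            for (String s : new String[] {"2019-10-20", "2019-02-29", "2019-00-10"}) {
                System.out.println(s + " shape=" + matchesShape(s) + " real=" + isRealDate(s));
            }
            // 2019-10-20 shape=true real=true
            // 2019-02-29 shape=true real=false
            // 2019-00-10 shape=true real=false
        }
    }
    ```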