Migrating from SimpleDateFormat to DateTimeFormatter: extra input


I am migrating some old code from SimpleDateFormat to DateTimeFormatter (in the Apache MIME4J library, where the change would unlock significant performance gains).

Since I work in the email field, I need to comply with RFC 5322, and I came up with the following formatter:

    public static final DateTimeFormatter RFC_5322 = new DateTimeFormatterBuilder()
        .parseCaseInsensitive()
        .parseLenient()
        .optionalStart()
            .appendText(DAY_OF_WEEK, dayOfWeek())
            .appendLiteral(", ")
        .optionalEnd()
        .appendValue(DAY_OF_MONTH, 1, 2, SignStyle.NOT_NEGATIVE)
        .appendLiteral(' ')
        .appendText(MONTH_OF_YEAR, monthOfYear())
        .appendLiteral(' ')
        .appendValueReduced(YEAR, 2, 4, INITIAL_YEAR)
        .appendLiteral(' ')
        .appendValue(HOUR_OF_DAY, 2)
        .appendLiteral(':')
        .appendValue(MINUTE_OF_HOUR, 2)
        .optionalStart()
            .appendLiteral(':')
            .appendValue(SECOND_OF_MINUTE, 2)
        .optionalEnd()
        .optionalStart()
            .appendLiteral('.')
            .appendValue(MILLI_OF_SECOND, 3)
        .optionalEnd()
        .optionalStart()
            .appendLiteral(' ')
            .appendOffset("+HHMM", "GMT")
        .optionalEnd()
        .optionalStart()
            .appendLiteral(' ')
            .appendOffsetId()
        .optionalEnd()
        .optionalStart()
            .appendLiteral(' ')
            .appendPattern("0000")
        .optionalEnd()
        .optionalStart()
            .appendLiteral(' ')
            .appendPattern("(zzz)")
        .optionalEnd()
        .toFormatter()
        .withZone(ZoneId.of("GMT"));

This works well with input such as Thu, 4 Oct 2001 20:12:26 -0700 (PDT).

However, some borderline emails have extra characters after the date, for example Date: Thu, 4 Oct 2001 20:12:26 -0700 (PDT),Thu, 4 Oct 2001 20:12:26 -0700, and this makes the parsing fail.

I would like some kind of wildcard that says "from here on, you are free to ignore extra input".

The previous version, based on SimpleDateFormat, handled this nicely.

Here is a link to the pull request: https://github.com/apache/james-mime4j/pull/44

Thank you in advance for your help!


Solution

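  • You can use DateTimeFormatter#parse(CharSequence, ParsePosition). Unlike parse(CharSequence), it does not require the whole input to be consumed: it starts at the given position, stops where the formatter stops matching, and updates the ParsePosition with the index where parsing ended.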

    Demo:

    import java.text.ParsePosition;
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.util.Locale;
    
    public class Main {
        public static void main(String[] args) {
            String s = "08/01/2021&&";
            DateTimeFormatter dtf = DateTimeFormatter.ofPattern("MM/dd/uuuu", Locale.ENGLISH);
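            // parse(CharSequence, ParsePosition) does not require the whole string to match,
            // so the trailing "&&" is left unparsed instead of causing an exception.
            LocalDate date = LocalDate.from(dtf.parse(s, new ParsePosition(0)));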
            System.out.println(date);
        }
    }
    

    Output:

    2021-08-01
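
Applied to the header from the question, the same idea might look like the sketch below. It is only an illustration: it uses the JDK's built-in DateTimeFormatter.RFC_1123_DATE_TIME as a stand-in for the custom RFC_5322 formatter, and the class name TrailingInputDemo and the helper parseLeading are hypothetical names introduced here. The ParsePosition tells you afterwards where parsing stopped, so the leftover text can be inspected or ignored.

```java
import java.text.ParsePosition;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TrailingInputDemo {

    // Hypothetical helper: parses a date-time at the start of the text and
    // leaves anything after it unparsed. RFC_1123_DATE_TIME is an assumption,
    // standing in for the custom RFC_5322 formatter from the question.
    static OffsetDateTime parseLeading(String text, ParsePosition position) {
        return OffsetDateTime.from(
                DateTimeFormatter.RFC_1123_DATE_TIME.parse(text, position));
    }

    public static void main(String[] args) {
        String header =
                "Thu, 4 Oct 2001 20:12:26 -0700 (PDT),Thu, 4 Oct 2001 20:12:26 -0700";

        ParsePosition position = new ParsePosition(0);
        OffsetDateTime dateTime = parseLeading(header, position);

        // The date-time that was recognised at the start of the header.
        System.out.println(dateTime);
        // Everything from the end-of-parse index onwards was ignored.
        System.out.println("Ignored: " + header.substring(position.getIndex()));

        // Input that does not start with a valid date still fails: this
        // overload reports errors by throwing, not via the error index
        // on the ParsePosition.
        try {
            parseLeading("not a date", new ParsePosition(0));
        } catch (DateTimeParseException e) {
            System.out.println("Rejected: " + e.getParsedString());
        }
    }
}
```

Note that only trailing input is tolerated this way. Garbage before the date, or a malformed date itself, still raises a DateTimeParseException, so the caller keeps control over how lenient to be.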
    
