Tags: java, parsing, rss, jodatime, rss-reader

joda - parsing time zone of pubDate in RSS item


I'm parsing the pubDate of an RSS item using Joda. The date has to be in RFC 822 format: http://feed2.w3.org/docs/error/InvalidRFC2822Date.html

The problem is that for a date like Wed, 02 Oct 2002 13:00:00 GMT I have to use the pattern:

DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss ZZZ").withLocale(Locale.ENGLISH).withOffsetParsed();

But the date can also look like Wed, 02 Oct 2002 15:00:00 +0200. In this case ZZZ doesn't work, and I have to use a single Z:

DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss Z").withLocale(Locale.ENGLISH).withOffsetParsed();

How can I create a universal solution that handles both?


Solution

  • I tested this with Joda-Time 2.7 and found two ways to do it:

    1. Use DateTimeFormatterBuilder's optional parsers:

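      // imports needed by the snippets in this answer
      import java.util.Locale;
      import org.joda.time.format.DateTimeFormat;
      import org.joda.time.format.DateTimeFormatter;
      import org.joda.time.format.DateTimeFormatterBuilder;
      import org.joda.time.format.DateTimeParser;

      // parser for a time zone ID such as "GMT" (pattern "ZZZ")
      DateTimeParser gmtParser = DateTimeFormat.forPattern("ZZZ").getParser();
      
      // parser for a numeric offset such as "+0200" (pattern "Z")
      DateTimeParser offsetParser = DateTimeFormat.forPattern("Z").getParser();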
      
      DateTimeFormatter formatter = new DateTimeFormatterBuilder()
          .appendPattern("EEE, dd MMM yyyy HH:mm:ss ") // common pattern
          .appendOptional(gmtParser)    // optional parser for GMT
          .appendOptional(offsetParser) // optional parser for +0200
          .toFormatter().withLocale(Locale.ENGLISH).withOffsetParsed();
      
    2. Pass DateTimeFormatterBuilder an array of parsers, one per input format; they are tried in order until one succeeds:

      // create array with all possible patterns
      DateTimeParser[] parsers = {
          DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss Z").getParser(),
          DateTimeFormat.forPattern("EEE, dd MMM yyyy HH:mm:ss ZZZ").getParser()
      };
      
      // create a formatter using the parsers array
      DateTimeFormatter formatter = new DateTimeFormatterBuilder()
          .append(null, parsers) // null printer: this formatter can parse but not print
          .toFormatter().withLocale(Locale.ENGLISH).withOffsetParsed();
      

    With either of the solutions above, the formatter works with both inputs:

    System.out.println(formatter.parseDateTime("Wed, 02 Oct 2002 13:00:00 GMT"));
    System.out.println(formatter.parseDateTime("Wed, 02 Oct 2002 15:00:00 +0200"));
    

    The output will be:

    2002-10-02T13:00:00.000Z
    2002-10-02T15:00:00.000+02:00
    

    Note: I believe the first solution is better when the patterns share a common part and differ only slightly, and the second is better when the patterns are very different from each other. That said, it is largely a matter of preference, so choose whichever reads better to you.
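
For comparison, here is a minimal sketch of the same idea without Joda-Time, assuming Java 8 or later is available. It swaps in the standard java.time API, which is not what was tested above: its predefined DateTimeFormatter.RFC_1123_DATE_TIME accepts both "GMT" and numeric offsets such as "+0200". The class name PubDateParser and the method parsePubDate are hypothetical names chosen for illustration.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper class, for illustration only.
final class PubDateParser {

    // Parses an RSS pubDate using the JDK's built-in RFC 1123 formatter,
    // which accepts "GMT" for a zero offset as well as offsets like "+0200".
    // Throws java.time.format.DateTimeParseException on unparseable input.
    static OffsetDateTime parsePubDate(String text) {
        return OffsetDateTime.parse(text, DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(parsePubDate("Wed, 02 Oct 2002 13:00:00 GMT"));
        System.out.println(parsePubDate("Wed, 02 Oct 2002 15:00:00 +0200"));
    }
}
```

One caveat with this sketch: the built-in formatter does not handle North American zone names such as EST, so feeds that use those would still need a custom formatter.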