I have been attempting to write some routines to read RSS and ATOM feeds using the new routines available in System.ServiceModel.Syndication, but unfortunately the Rss20FeedFormatter bombs out on about half the feeds I try with the following exception:
An error was encountered when parsing a DateTime value in the XML.
This seems to occur whenever the RSS feed expresses the publish date in the following format:
Thu, 16 Oct 08 14:23:26 -0700
If the feed expresses the publish date as GMT, things go fine:
Thu, 16 Oct 08 21:23:26 GMT
If there's some way to work around this with XmlReaderSettings, I have not found it. Can anyone assist?
RSS 2.0 formatted syndication feeds utilize the RFC 822 date-time specification when serializing elements like pubDate and lastBuildDate. The RFC 822 date-time specification is unfortunately a very 'flexible' syntax for expressing the time-zone component of a DateTime.
Time zone may be indicated in several ways. "UT" is Universal Time (formerly called "Greenwich Mean Time"); "GMT" is permitted as a reference to Universal Time. The military standard uses a single character for each zone. "Z" is Universal Time. "A" indicates one hour earlier, and "M" indicates 12 hours earlier; "N" is one hour later, and "Y" is 12 hours later. The letter "J" is not used. The other remaining two forms are taken from ANSI standard X3.51-1975. One allows explicit indication of the amount of offset from UT; the other uses common 3-character strings for indicating time zones in North America.
I believe the issue involves how the zone component of the RFC 822 date-time value is being processed. The feed formatter does not appear to handle date-times that use a local differential (a numeric offset such as -0700) to indicate the time zone.
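As RFC 1123 extends the RFC 822 specification, you could try using the DateTimeFormatInfo.RFC1123Pattern ("r") to handle converting problematic date-times, or write your own parsing code for RFC 822 formatted dates. Note that the "r" pattern itself expects a four-digit year and the literal "GMT", so a value with a two-digit year and a -0700 offset would most likely still need custom handling. Another option would be to use a third party framework instead of the System.ServiceModel.Syndication namespace classes.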
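If you go the route of writing your own parsing code, a minimal sketch might look like the following. The Rfc822Date class and its TryParse method are hypothetical names of my own, not part of System.ServiceModel.Syndication. The sketch assumes English day and month names, handles numeric offsets plus the UT/GMT/Z and North American zone names quoted above, and ignores the single-letter military zones other than "Z". It relies on the "zzz" custom format specifier accepting an offset written without a colon.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not a framework class.
public static class Rfc822Date
{
    // Named zones from RFC 822 mapped to the equivalent numeric offset.
    private static readonly Dictionary<string, string> NamedZones =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "UT", "+0000" }, { "GMT", "+0000" }, { "Z", "+0000" },
            { "EST", "-0500" }, { "EDT", "-0400" },
            { "CST", "-0600" }, { "CDT", "-0500" },
            { "MST", "-0700" }, { "MDT", "-0600" },
            { "PST", "-0800" }, { "PDT", "-0700" }
        };

    // Day-of-week is optional in RFC 822; years may be two or four digits.
    private static readonly string[] Formats =
    {
        "ddd, d MMM yyyy HH:mm:ss zzz",
        "ddd, d MMM yy HH:mm:ss zzz",
        "d MMM yyyy HH:mm:ss zzz",
        "d MMM yy HH:mm:ss zzz"
    };

    public static bool TryParse(string text, out DateTimeOffset result)
    {
        result = default(DateTimeOffset);
        if (text == null) return false;

        string value = text.Trim();
        int lastSpace = value.LastIndexOf(' ');
        if (lastSpace < 0) return false;

        // Replace a named zone (e.g. "GMT") with its numeric offset so that
        // every input can be matched against the same set of formats.
        string zone = value.Substring(lastSpace + 1);
        string offset;
        if (NamedZones.TryGetValue(zone, out offset))
        {
            value = value.Substring(0, lastSpace + 1) + offset;
        }

        return DateTimeOffset.TryParseExact(
            value, Formats, CultureInfo.InvariantCulture,
            DateTimeStyles.AllowWhiteSpaces, out result);
    }
}
```

With this sketch, both sample values from the question ("Thu, 16 Oct 08 14:23:26 -0700" and "Thu, 16 Oct 08 21:23:26 GMT") should parse to the same UTC instant. You would still need to get the raw date string in front of the parser yourself, for example by reading the pubDate element directly rather than through Rss20FeedFormatter.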
It appears there are some known issues with date-time parsing and the Rss20FeedFormatter that are in the process of being addressed by Microsoft.