c# xml asp.net-mvc rss xmltextreader

Can't read RSS feed with XmlTextReader - "A column named 'link' already belongs to this DataTable"


I've been using an XmlTextReader to read RSS for many years, but two feeds I use have suddenly introduced an extra line that trips up the parser.

The problem is that the second line here conflicts with the first:

<link>http://www.eventjobsearch.co.uk/jobsrss/</link>
<atom:link href="http://www.eventjobsearch.co.uk/jobsrss/" rel="self" type="application/rss+xml"/>

The parser thinks the atom:link element is a duplicate of the link element. I don't personally need the atom:link line, but because I'm reading from a stream I can't see any way to remove it, or to remove the colon (which would solve the problem).

How can I get rid of the colon in the stream so the built-in parser works again?

 // Request the feed with a browser-like user agent.
 HttpWebRequest req = (HttpWebRequest)WebRequest.Create(WebConfigurationManager.AppSettings["XmlJobsFeedUrl"]);
 req.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)";

 DataSet jobs = new DataSet("Jobs");
 using (WebResponse response = req.GetResponse())
 using (Stream stream = response.GetResponseStream())
 using (XmlTextReader xmlTextReader = new XmlTextReader(stream))
 {
     // Fails here: "A column named 'link' already belongs to this DataTable"
     jobs.ReadXml(xmlTextReader);
 }

Solution

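  • Please see this question and solution. Immediately before calling jobs.ReadXml(...), you can read the schema, so that ReadXml should use the loaded schema rather than infer columns from the feed (the inference is where link and atom:link collide):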

    jobs.ReadXmlSchema("http://www.thearchitect.co.uk/schemas/rss-2_0.xsd");
    

    It's probably best to copy the XSD file to your own server rather than depend on the remote URL.
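
For illustration, here is a minimal sketch of both options: the schema-first approach described above, and the alternative the question asks about (dropping the atom:link line before parsing). The class name JobsFeedLoader, both method names, and the schemaPath parameter are hypothetical and not part of the original code. The second method assumes the feed binds the atom prefix to the standard Atom namespace, http://www.w3.org/2005/Atom; it also buffers the whole feed in memory, which should be fine for a typical RSS document.

```csharp
using System.Data;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper; class and method names are illustrative only.
public static class JobsFeedLoader
{
    // Approach from the answer: load the schema first, then the data.
    // schemaPath is assumed to point at a local copy of an RSS 2.0 XSD.
    public static DataSet LoadWithSchema(Stream feed, string schemaPath)
    {
        var jobs = new DataSet("Jobs");
        jobs.ReadXmlSchema(schemaPath);
        using (XmlReader reader = XmlReader.Create(feed))
        {
            jobs.ReadXml(reader);
        }
        return jobs;
    }

    // Alternative: remove every element in the Atom namespace, then let
    // DataSet infer its tables from the remaining plain RSS elements.
    public static DataSet LoadWithoutAtom(Stream feed)
    {
        // Assumption: the feed uses the standard Atom namespace URI.
        XNamespace atom = "http://www.w3.org/2005/Atom";
        XDocument doc = XDocument.Load(feed);
        doc.Descendants().Where(e => e.Name.Namespace == atom).Remove();

        var jobs = new DataSet("Jobs");
        using (XmlReader reader = doc.CreateReader())
        {
            jobs.ReadXml(reader);
        }
        return jobs;
    }
}
```

With the code from the question, either call would replace the XmlTextReader and ReadXml lines, for example DataSet jobs = JobsFeedLoader.LoadWithoutAtom(stream);. Matching on the namespace rather than on the literal text "atom:" means the filter should still work if a feed happens to use a different prefix for the same namespace.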