
Using LINQ to query XDocument, how to get specific values?


I'm trying to refactor the following code. It works, but it will become unmanageable as I start reading more elements from the XML:

HttpResponseMessage response = await httpClient.GetAsync("https://uri/products.xml");

string responseAsString = await response.Content.ReadAsStringAsync();

List<Product> productList = new List<Product>();

XDocument xdocument = XDocument.Parse(responseAsString);
var products = xdocument.Descendants().Where(p => p.Name.LocalName == "item");

foreach(var product in products)
{
    var thisProduct = new Product();
    foreach (XElement el in product.Nodes())
    {
        if(el.Name.LocalName == "id")
        {
            thisProduct.SKU = el.Value.Replace("-master", "");
        }
        if (el.Name.LocalName == "availability")
        {
            thisProduct.Availability = el.Value == "in stock";
        }
    }
    productList.Add(thisProduct);
}

The URL returns the following XML:

<rss xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
    xmlns="http://base.google.com/ns/1.0" version="0">
    <channel>
        <title>Product Feed</title>
        <link></link>
        <description>Products</description>
        <item>
            <availability>in stock</availability>
            <id>01234-master</id>
            ...
        </item>
        <item>
            <availability>in stock</availability>
            <id>abcde-master</id>
            ...
        </item>
    </channel>
</rss>

Ideally I'd like to remove the loops and if statements and have a LINQ query that returns only the fields I need (id, availability, etc.) from the XML in a nice clean way, and populate a simple class with this data.

Can anyone help?


Solution

  • Sometimes you have to be happy with the code you have written. Sometimes there is no "smarter" way to write it... You can only write it a little "better":

    List<Product> productList = new List<Product>();
    
    XDocument xdocument = XDocument.Parse(responseAsString);
    
    XNamespace ns = "http://base.google.com/ns/1.0";
    
    var products = from x in xdocument.Elements(ns + "rss")
                   from y in x.Elements(ns + "channel")
                   from z in y.Elements(ns + "item")
                   select z;
    
    foreach (var product in products)
    {
        var prod = new Product();
        productList.Add(prod);
    
        foreach (XElement el in product.Elements())
        {
            if (el.Name == ns + "id")
            {
                prod.SKU = el.Value.Replace("-master", string.Empty);
            }
            else if (el.Name == ns + "availability")
            {
                prod.Availability = el.Value == "in stock";
            }
        }
    }
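
If the main goal is to drop the inner loop and the if chain, one possible alternative is to look up each child element by name and project straight into Product. This is a sketch rather than part of the code above: the Product class shape is assumed from the question (a string SKU and a bool Availability), and the ProductFeed class and its Parse method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Assumed shape of the question's Product class (it is not shown in the original).
public class Product
{
    public string SKU { get; set; }
    public bool Availability { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class ProductFeed
{
    private static readonly XNamespace Ns = "http://base.google.com/ns/1.0";

    public static List<Product> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Walk the fixed path /rss/channel/item, then project each item.
        var query =
            from item in doc.Elements(Ns + "rss")
                            .Elements(Ns + "channel")
                            .Elements(Ns + "item")
            select new Product
            {
                // Casting a missing (null) XElement to string yields null,
                // so an absent <id> gives a null SKU instead of an exception.
                SKU = ((string)item.Element(Ns + "id"))?.Replace("-master", string.Empty),
                Availability = (string)item.Element(Ns + "availability") == "in stock"
            };

        return query.ToList();
    }
}
```

Calling ProductFeed.Parse(responseAsString) would then replace both loops. The trade-off is one Element() lookup per field per item instead of a single pass over the children, which should not matter unless the feed is very large.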
    

    Notes:

    • Using Descendants() is morally wrong. The item elements sit at a fixed position that you know perfectly well: /rss/channel/item. It isn't //item. Tomorrow the feed could gain an rss/foo/item that doesn't exist today, and Descendants() would pick it up too. Try to write your code so that it is forward compatible with additional information that could be added to the XML.
    • I do hate XML namespaces... And there are XML documents with multiple nested namespaces. How much I hate those. But someone more intelligent than me decided that they exist. I accept it. I code using them. In LINQ to XML it is quite easy: there is an XNamespace type that even has an overloaded + operator.

      Note that if you are a micro-optimizer (I try not to be, but I have to admit my hands are itching a little), you can pre-calculate the various ns + "xxx" values that are used inside the foreach loop. It isn't obvious from the code, but they are all rebuilt on every iteration. And how an XName is built inside... oh... that is a fascinating thing, trust me.

      private static readonly XNamespace googleNs = "http://base.google.com/ns/1.0";
      private static readonly XName idName = googleNs + "id";
      private static readonly XName availabilityName = googleNs + "availability";
      

      and then

      if (el.Name == idName)
      {
          prod.SKU = el.Value.Replace("-master", string.Empty);
      }
      else if (el.Name == availabilityName)
      {
          prod.Availability = el.Value == "in stock";
      }
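
To make the first note about Descendants() concrete, here is a small sketch comparing the two lookups on a feed that has gained an extra section. The foo element and its item are invented for the demonstration, and the ItemLookup class, its methods, and the SampleFeed constant are hypothetical names, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named here only for illustration.
public static class ItemLookup
{
    private static readonly XNamespace Ns = "http://base.google.com/ns/1.0";

    // Invented "future" feed: a <foo> section that carries its own <item>.
    public const string SampleFeed = @"
<rss xmlns=""http://base.google.com/ns/1.0"">
  <channel>
    <item><id>01234-master</id></item>
    <item><id>abcde-master</id></item>
  </channel>
  <foo>
    <item><id>unrelated</id></item>
  </foo>
</rss>";

    // Matches every <item> anywhere in the document, like //item.
    public static int CountWithDescendants(XDocument doc) =>
        doc.Descendants(Ns + "item").Count();

    // Matches only /rss/channel/item.
    public static int CountWithPath(XDocument doc) =>
        doc.Elements(Ns + "rss")
           .Elements(Ns + "channel")
           .Elements(Ns + "item")
           .Count();
}
```

With XDocument.Parse(ItemLookup.SampleFeed), CountWithDescendants returns 3 while CountWithPath returns 2, so only the explicit path ignores the unrelated item.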