c# xpath html-agility-pack

XPath to Capture Alternating Rows


Using HtmlAgilityPack, I am trying to capture rows of a table where the row class name alternates. Snippet below:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
var documentNode = doc.DocumentNode;
var lstNodes = documentNode.SelectNodes("//table[@class='rgMasterTable']");
var tableNode = lstNodes[0];
var rows = tableNode.SelectNodes("//tr[@class='rgRow dnnGridItem'|@class='rgAltRow dnnGridAltItem']");

On the last line, I am trying to say "Give me rows where the class is either rgRow dnnGridItem or rgAltRow dnnGridAltItem." However, I get the following exception:

Exception thrown: 'System.Xml.XPath.XPathException' in System.Xml.dll

Additional information: Expression must evaluate to a node-set.

The source of the HTML is available here: http://www.terna.it/it-it/sistemaelettrico/remit.aspx

Any assistance with the correct XPath query would be greatly appreciated.


Solution

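  • Thanks to the link from @LukášDoležal: Select on multiple criteria with XPath

    The union operator (|) combines node-sets, so it has to join two complete node selections, not two class tests. Inside a predicate, each @class='...' comparison evaluates to a boolean, which is why the expression fails with "Expression must evaluate to a node-set".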

    var rows = tableNode.SelectNodes("//tr[@class='rgRow dnnGridItem'|@class='rgAltRow dnnGridAltItem']");
    

    becomes

    var rows = tableNode.SelectNodes("//tr[@class='rgRow dnnGridItem'] | //tr[@class='rgAltRow dnnGridAltItem']");
    

    or shorter yet (thanks to @splash58):

    var rows = tableNode.SelectNodes("//tr[@class='rgRow dnnGridItem' or @class='rgAltRow dnnGridAltItem']");
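
A minimal sketch of the `or` form in context, assuming the HtmlAgilityPack NuGet package is referenced. The class `GridRows` and the method `RowTexts` are hypothetical names used only for illustration; they are not from the original post. The sketch also uses `.//tr` instead of `//tr`: an XPath that starts with `//` is evaluated from the document root even when it is called on `tableNode`, so the relative form is what keeps the search inside the selected table. `SelectNodes` returns null when nothing matches, so both lookups are checked.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party package, assumed to be installed

public static class GridRows // hypothetical helper, not from the original post
{
    // Returns the trimmed text of every data row in the rgMasterTable grid.
    public static List<string> RowTexts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var table = doc.DocumentNode.SelectSingleNode("//table[@class='rgMasterTable']");
        if (table == null)
        {
            return new List<string>();
        }

        // ".//tr" is relative to the table; "//tr" would search the whole document.
        // "or" joins the two boolean class tests inside a single predicate.
        var rows = table.SelectNodes(
            ".//tr[@class='rgRow dnnGridItem' or @class='rgAltRow dnnGridAltItem']");

        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (rows == null)
        {
            return new List<string>();
        }

        return rows.Select(r => r.InnerText.Trim()).ToList();
    }
}
```

Calling `GridRows.RowTexts(html)` with the page source should give one entry per regular or alternating row and skip header rows as well as rows that belong to any other table on the page.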