c# html-agility-pack .net-7.0

HTML Agility Pack not matching XPath where attribute name starts-with


I am trying to select all HTML nodes where the node contains an attribute with a name starting with 'on'.

This is what I have for the XPath:

//*[@*[starts-with(name(), 'on')]]

SelectNodes returns null when I call it with the above XPath on the HTML <div onclick="alert('test');"></div>.

using HtmlAgilityPack;

var document = new HtmlDocument();
document.LoadHtml("<div onclick=\"alert('test');\"></div>");

// nodes is null here, even though the div has an onclick attribute.
var nodes = document.DocumentNode.SelectNodes("//*[@*[starts-with(name(), 'on')]]");

I have tested the XPath on a couple of XPath testing sites (https://www.freeformatter.com/xpath-tester.html#before-output and http://xpather.com/), and both return the div node. Do XPath functions not work with HTML Agility Pack, or do I need to do something different?


Solution

  • It seems to work if you use the local-name() function instead of name(). I think this is a bug in HtmlAgilityPack. In the library source, the implementation of HtmlNodeNavigator.LocalName accounts for _attIndex being set and returns the name of the corresponding attribute. The implementation of HtmlNodeNavigator.Name that follows it, however, does not account for _attIndex at all. So in your case, even though the HtmlNodeNavigator may currently be pointing at the attribute itself via _attIndex, the .Name property erroneously returns div instead of onclick.

    I would file an issue in their GitHub repo.
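
The workaround above can be sketched roughly as follows, assuming the HtmlAgilityPack NuGet package is referenced. The class and method names (EventAttributeFinder, ByXPath, ByLinq) are hypothetical and only for illustration. ByXPath applies the local-name() substitution suggested in the answer. ByLinq is an alternative that does not come from the answer: it filters on attribute names in C# and so avoids XPath name functions altogether.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class EventAttributeFinder
{
    // Workaround from the answer: use local-name() instead of name().
    public static IReadOnlyList<HtmlNode> ByXPath(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = document.DocumentNode.SelectNodes(
            "//*[@*[starts-with(local-name(), 'on')]]");

        return nodes?.ToList() ?? new List<HtmlNode>();
    }

    // Alternative (an assumption, not from the answer): skip XPath and
    // inspect each node's attribute names directly.
    public static IReadOnlyList<HtmlNode> ByLinq(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        return document.DocumentNode
            .Descendants()
            .Where(node => node.HasAttributes &&
                           node.Attributes.Any(attribute =>
                               attribute.Name.StartsWith("on", StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }
}
```

Either method can be called with the HTML from the question, for example EventAttributeFinder.ByLinq("<div onclick=\"alert('test');\"></div>"). The LINQ version may be the safer choice if the XPath behaviour differs between HtmlAgilityPack versions, since it does not rely on how the navigator reports attribute names.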