c# xpath html-agility-pack

How to select node where descendant node has attribute


I am parsing an HTML email. I need to get the href attribute from the following HTML:

 <a href="https://sample.com/us/en/suv-rental/united-states/orlando-fl/jeep/grand-cherokee/12345" target="_blank">
     <img class="m_3371787045960899181vehicle-image" width="360" height="216" src="https://images.sample.com/media/vehicle/images/12345.620x372.jpg" alt="Jeep Grand Cherokee" title="Jeep Grand Cherokee">
 </a>

The only way to identify it is to find the a element that contains an img whose src includes 'https://images.sample.com'.

What I need is: https://sample.com/us/en/suv-rental/united-states/orlando-fl/jeep/grand-cherokee/12345

I am struggling to get this to work. This is what I have so far:

 HtmlNode vehicleNode = document.DocumentNode.SelectNodes("//a").Where(x => x.DescendantNodes.Attributes["src"].Value.Contains("images.sample.com")).First();

But this does not compile, because x.DescendantNodes cannot be used like that, and I cannot find the correct way to do this.

So how do I select a node based on an attribute of a descendant node?


Solution

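  • In XPath terms, it seems you can use //a[img/@src[starts-with(., 'https://images.sample.com')]]. This selects each a element that has a child img whose src attribute starts with 'https://images.sample.com'.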
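
For reference, here is a minimal sketch of how that XPath might be used from Html Agility Pack, alongside a LINQ version of what the question was attempting. The class and method names (VehicleLinkFinder, FindHrefWithXPath, FindHrefWithLinq) are made up for illustration, and the XPath is written in the equivalent form img[starts-with(@src, ...)] rather than copied verbatim from the answer. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, not part of Html Agility Pack.
public static class VehicleLinkFinder
{
    // Prefix taken from the sample HTML in the question.
    private const string ImageHostPrefix = "https://images.sample.com";

    // XPath approach: select the first <a> that has a child <img>
    // whose src starts with the image host prefix.
    public static string FindHrefWithXPath(HtmlDocument document)
    {
        HtmlNode anchor = document.DocumentNode.SelectSingleNode(
            "//a[img[starts-with(@src, '" + ImageHostPrefix + "')]]");

        // SelectSingleNode returns null when nothing matches.
        return anchor?.GetAttributeValue("href", string.Empty);
    }

    // LINQ approach: Descendants("img") is a method call that returns a
    // sequence, so each <img> is tested individually with Any(...).
    // Unlike the XPath above, this matches an <img> at any depth below the <a>.
    public static string FindHrefWithLinq(HtmlDocument document)
    {
        HtmlNode anchor = document.DocumentNode
            .Descendants("a")
            .FirstOrDefault(a => a.Descendants("img").Any(img =>
                img.GetAttributeValue("src", string.Empty)
                   .StartsWith(ImageHostPrefix, StringComparison.Ordinal)));

        return anchor?.GetAttributeValue("href", string.Empty);
    }

    public static void Main()
    {
        // Shortened version of the markup from the question.
        const string html = @"
<a href=""https://sample.com/us/en/suv-rental/united-states/orlando-fl/jeep/grand-cherokee/12345"" target=""_blank"">
    <img width=""360"" height=""216"" src=""https://images.sample.com/media/vehicle/images/12345.620x372.jpg"" alt=""Jeep Grand Cherokee"">
</a>";

        var document = new HtmlDocument();
        document.LoadHtml(html);

        Console.WriteLine(FindHrefWithXPath(document));
        Console.WriteLine(FindHrefWithLinq(document));
    }
}
```

Both methods return null when no matching a element exists, so callers should check for that instead of chaining .First(), which throws on an empty sequence. If the src only needs to include the host rather than start with it, contains(@src, 'images.sample.com') in the XPath (or Contains in the LINQ version) is the corresponding change.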