How to get the value of a tag that has no class or id in Html Agility Pack?

I am trying to get the text value of this <a> tag:

<a href="item?id=22513425">67 comments</a>

So I'm trying to get '67' from this. However, there are no identifying classes or ids.

I've managed to get this far:

        IEnumerable<HtmlNode> commentsNode = htmlDoc.DocumentNode.Descendants(0).Where(n => n.HasClass("subtext"));

        var storyComments = commentsNode.Select(n =>
            n.SelectSingleNode("//a[3]")).ToList();

This only gives me "comments", annoyingly enough.

I can't use the href id, as there are many of these items, so I can't hardcode the href.

How can I extract the number as well?

Solution

Just use the @href attribute and a dedicated string function:

substring-before(//a[@href="item?id=22513425"],"comments")

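returns 67 (followed by the space that precedes "comments"; wrap the expression in normalize-space() to trim it).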
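In Html Agility Pack, SelectSingleNode and SelectNodes return nodes, so a string-valued expression such as substring-before(...) is usually evaluated through an XPathNavigator instead. Here is a minimal sketch of that, assuming the HtmlAgilityPack NuGet package is referenced. The class name, the helper EvaluateString and the inline sample markup are hypothetical and only there for illustration.

```csharp
using System;
using System.Xml.XPath;
using HtmlAgilityPack;

public static class CommentCountDemo
{
    // Hypothetical helper: evaluates a string-valued XPath 1.0 expression
    // against an HTML snippet and trims surrounding whitespace.
    public static string EvaluateString(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // HtmlDocument exposes an XPathNavigator; Evaluate can return
        // strings, numbers and booleans, not only node sets.
        XPathNavigator nav = doc.CreateNavigator();
        return Convert.ToString(nav.Evaluate(xpath)).Trim();
    }

    public static void Main()
    {
        // Assumed sample markup, modelled on the tag from the question.
        string html = "<span class=\"subtext\"><a href=\"item?id=22513425\">67 comments</a></span>";
        string xpath = "substring-before(//a[@href='item?id=22513425'], 'comments')";

        Console.WriteLine(EvaluateString(html, xpath));
    }
}
```

The Trim() call removes the space left in front of "comments"; int.Parse or int.TryParse can then turn the result into a number.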

EDIT: Since you can't hardcode the whole value of @href, you can match only its beginning with starts-with. This is an XPath 1.0 solution.

Shortest form (the link text also has to contain "comments"):

substring-before(//a[starts-with(@href,"item?") and text()[contains(.,"comments")]],"c")

More restrictive (the link text also has to end with "comments"):

substring-before(//a[starts-with(@href,"item?")][substring(., string-length(.) - string-length('comments') + 1) = 'comments'],"c")
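Since the page holds many of these items, the same idea can be applied once per "subtext" node from C#. Note that an XPath starting with // (as in the question's "//a[3]") searches from the document root even when it is called on a child node; a leading .// keeps the search inside the current node. The following is only a sketch under assumptions: the class name, the method ExtractCommentCounts and the sample markup are hypothetical, and it reads the leading digits of the link text rather than using substring-before.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SubtextComments
{
    // Hypothetical helper: returns one comment count per "subtext" node
    // that contains a matching comments link.
    public static List<int> ExtractCommentCounts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var counts = new List<int>();
        foreach (HtmlNode subtext in doc.DocumentNode.Descendants().Where(n => n.HasClass("subtext")))
        {
            // ".//" makes the query relative to the current subtext node.
            HtmlNode link = subtext.SelectSingleNode(
                ".//a[starts-with(@href,'item?') and contains(., 'comment')]");
            if (link == null) continue;

            // Keep only the leading digits of text such as "67 comments".
            string digits = new string(
                link.InnerText.Trim().TakeWhile(c => char.IsDigit(c)).ToArray());
            if (int.TryParse(digits, out int count)) counts.Add(count);
        }
        return counts;
    }

    public static void Main()
    {
        // Assumed sample markup with two items.
        string html =
            "<span class=\"subtext\"><a href=\"user?id=a\">a</a> <a href=\"item?id=1\">67 comments</a></span>" +
            "<span class=\"subtext\"><a href=\"user?id=b\">b</a> <a href=\"item?id=2\">5 comments</a></span>";

        Console.WriteLine(string.Join(", ", ExtractCommentCounts(html)));
    }
}
```

Matching on 'comment' rather than 'comments' is a deliberate choice in this sketch so that a singular label would also be picked up; items without any matching link are simply skipped.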