Given the snippet of HTML and code below, if you know part of the src (e.g. 'FileName'), how do you get the post ID of the parent div? The div could be higher up the DOM tree, and there could be 0, 1 or many src's with the same 'FileName'.
I'm after "postId_19701770"
I've attempted to follow this page and this page, but I get: Error CS1061 'HtmlNodeCollection' does not contain a definition for 'ParentNode'
using System;
using HtmlAgilityPack;

namespace GetParent
{
    class Program
    {
        static void Main(string[] args)
        {
            var html =
@"<body>
<div id='postId_19701770' class='b-post'>
<h1>This is <b>bold</b> heading</h1>
<p>This is <u>underlined</u> paragraph <div src='example.com/FileName_720p.mp4' </div></p>
</div>
</body>";
            var htmlDoc = new HtmlDocument();
            htmlDoc.LoadHtml(html);
            string keyword = "FileName";
            var node = htmlDoc.DocumentNode.SelectNodes("//*[text()[contains(., '" + keyword + "')]]");
            var parentNode = node.ParentNode; // Error CS1061 is reported on this line
            Console.WriteLine(parentNode.Name);
            Console.ReadLine();
        }
    }
}
The reason your code is not working is that you are looking up ParentNode on a collection of nodes. You need to select a single node and then look up its parent.
Your XPath also tests text(), but the keyword is in the src attribute, so match on @src instead. That gives you the collection of all nodes whose src contains the data you are looking for. Once you have the collection, you can check each of those nodes to see which one you need, or select the First() one from the collection to get its parent.
var html =
@"<body>
<div id='postId_19701770' class='b-post'>
<h1>This is <b>bold</b> heading</h1>
<p>This is <u>underlined</u> paragraph <div src='example.com/FileName_720p.mp4' </div></p>
</div>
</body>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
string keyword = "FileName";
// SelectNodes returns a collection of every match (or null when nothing matches).
var nodes = htmlDoc.DocumentNode.SelectNodes("//*[contains(@src, '" + keyword + "')]");
var parent = nodes.First().ParentNode; // take the first match; First() needs using System.Linq
Console.WriteLine(parent.GetAttributeValue("id", string.Empty));
// Prints:
// postId_19701770
Instead of looking up all matching nodes, you can search for just one node with the SelectSingleNode method, which returns the first match:
var singleNode = htmlDoc.DocumentNode.SelectSingleNode(@"//*[contains(@src, '" + keyword + "')]");
Console.WriteLine(singleNode.ParentNode.GetAttributeValue("id", string.Empty));
// Prints:
// postId_19701770
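The question also mentions that the post div could be higher up the DOM tree and that there could be 0, 1 or many matching src values. A minimal sketch of one way to handle those cases is below. It assumes the post container is always a div whose id starts with "postId_" (inferred from the sample HTML only); PostLookup and FindPostIds are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class PostLookup
{
    // Hypothetical helper: returns the id of the nearest enclosing post div
    // for every element whose src attribute contains the keyword.
    public static List<string> FindPostIds(HtmlDocument doc, string keyword)
    {
        // Note: the keyword is concatenated into the XPath as in the code above,
        // so a keyword containing a single quote would break the expression.
        var matches = doc.DocumentNode.SelectNodes("//*[contains(@src, '" + keyword + "')]");

        // SelectNodes returns null rather than an empty collection when nothing matches.
        if (matches == null)
            return new List<string>();

        return matches
            // Ancestors("div") walks upward from the parent, so the first hit is the nearest one.
            // Assumption: post containers have an id beginning with "postId_".
            .Select(n => n.Ancestors("div")
                          .FirstOrDefault(a => a.GetAttributeValue("id", string.Empty)
                                                .StartsWith("postId_", StringComparison.Ordinal)))
            .Where(a => a != null)
            .Select(a => a.GetAttributeValue("id", string.Empty))
            .Distinct()
            .ToList();
    }
}
```

Calling PostLookup.FindPostIds(htmlDoc, keyword) gives an empty list when nothing matches, and one id per distinct post otherwise, so the caller does not need a null check or a call to First().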