Tags: c#, html, tags, html-agility-pack

html agility section selecting


I need to get the HTML in between 2 section tags like this:

<section class="image-section">
//images here...
</section>

I am using HTML Agility Pack to do this and thought this would work:

HtmlNode sec = document.DocumentNode.SelectNodes("//*[@class='image-section'")

but this does not. How would I get the HTML snippet I want?


Solution

  • Assuming we have the following HTML:

    <!DOCTYPE html>
    <html>
       <body>
          <h1>Test</h1>
          <section class="image-section">
             <img src="image1.jpg">
             <img src="image2.jpg">
          </section>
       </body>
    </html>
    

    Here is the code:

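    using System;
    using System.IO;
    using System.Linq;
    using HtmlAgilityPack;

    class Program
    {
        static void Main(string[] args)
        {
            var html = File.ReadAllText(@"d:/my.html");

            var htmlDoc = new HtmlDocument();
            htmlDoc.LoadHtml(html);

            // SelectNodes returns null (not an empty collection) when nothing matches,
            // so guard with ?. before calling FirstOrDefault.
            HtmlNodeCollection sections = htmlDoc.DocumentNode.SelectNodes("//*[@class='image-section']");
            var section = sections?.FirstOrDefault();
            if (section != null)
            {
                // Print every <img> that is a direct child of the section.
                foreach (var imgElement in section.Elements("img"))
                {
                    Console.WriteLine(imgElement.OuterHtml);
                }
            }

            Console.ReadKey();
        }
    }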
    

    Output:

    <img src="image1.jpg">
    <img src="image2.jpg">
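
If what you want is everything between the section tags as a single string, rather than each image separately, a minimal sketch along these lines may help. It assumes the same markup as above, inlined as a string for illustration, and the class name InnerHtmlSketch is made up for the example. It uses SelectSingleNode and the InnerHtml property of HtmlNode:

```csharp
using System;
using HtmlAgilityPack;

class InnerHtmlSketch
{
    static void Main()
    {
        // Inline sample markup, assumed for illustration only.
        var html = "<body><h1>Test</h1>" +
                   "<section class=\"image-section\">" +
                   "<img src=\"image1.jpg\"><img src=\"image2.jpg\">" +
                   "</section></body>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns the first match, or null if there is none.
        HtmlNode section = doc.DocumentNode.SelectSingleNode("//section[@class='image-section']");

        if (section != null)
        {
            // InnerHtml is the markup between <section ...> and </section>.
            Console.WriteLine(section.InnerHtml);
        }
    }
}
```

Note that @class='image-section' only matches when the class attribute is exactly that string. If the element may carry several classes, an XPath such as //section[contains(@class, 'image-section')] is a common, looser alternative.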