html-agility-pack

Ignore some TR nodes


I have HTML like this:

<body>
<tr class="sysinfoTableCategoryHeader">
    <td colspan="4">Operating System</td>
</tr>

    <tr class="sysinfoTablePropertyEven">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Operating System Name</span></td>
        <td><span class="sysinfoTablePropertyValue">Linux</span></td>
    </tr>

    <tr class="sysinfoTablePropertyOdd">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Kernel Version</span></td>
        <td><span class="sysinfoTablePropertyValue">4.8.0-1-amd64</span></td>
    </tr>

<tr class="sysinfoTableCategoryHeader">
    <td colspan="4">Motherboard</td>
</tr>

    <tr class="sysinfoTablePropertyEven">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Manufacturer</span></td>
        <td><span class="sysinfoTablePropertyValue">Acer</span></td>
    </tr>

    <tr class="sysinfoTablePropertyOdd">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Product</span></td>
        <td><span class="sysinfoTablePropertyValue">Aspire E5-531</span></td>
    </tr>
</body>

I can already select the entire body from this HTML file, which works well. There is one problem, though: from that body I want to ignore the node with class="sysinfoTableCategoryHeader" whose text is "Operating System".

Is this doable at all?

My output should look like this:

<body>
<tr class="sysinfoTableCategoryHeader">
    <td colspan="4">Motherboard</td>
</tr>

    <tr class="sysinfoTablePropertyEven">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Manufacturer</span></td>
        <td><span class="sysinfoTablePropertyValue">Acer</span></td>
    </tr>

    <tr class="sysinfoTablePropertyOdd">
        <td />
        <td />
        <td><span class="sysinfoTablePropertyKey">Product</span></td>
        <td><span class="sysinfoTablePropertyValue">Aspire E5-531</span></td>
    </tr>
</body>

How can I accomplish this with HtmlAgilityPack?


Solution

  • My English is limited, so here is example code:

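        HtmlDocument htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html); // html: a string containing your HTML
        // every <tr> under <body> whose class is not the category header class
        HtmlNodeCollection htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/tr[@class!='sysinfoTableCategoryHeader']");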
    

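    htmlNodes now holds the rows you need. Alternatively, select the header rows and delete them from the document with Remove(). (RemoveAllIDforNode() only clears HtmlAgilityPack's internal id lookup; it does not delete the node.)

        HtmlNodeCollection headerNodes = htmlDoc.DocumentNode.SelectNodes("//body/tr[@class='sysinfoTableCategoryHeader']");

        // SelectNodes returns null when nothing matches
        if (headerNodes != null) {
            foreach (HtmlNode node in headerNodes) {
                node.Remove(); // detach this row from its parent
            }
        }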
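
Note that selecting or removing by class alone affects both category headers, while the expected output in the question drops the whole "Operating System" section (the header plus its property rows) and keeps "Motherboard". A minimal sketch of one way to do that follows. It assumes each section is a header row followed by its property rows as siblings under `<body>`, as in the sample HTML; `SectionRemover` and `RemoveSection` are hypothetical names made up for this example, not part of HtmlAgilityPack.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class SectionRemover
{
    // Hypothetical helper: removes the category header whose text equals
    // `title`, plus every sibling after it up to the next category header.
    public static string RemoveSection(string html, string title)
    {
        const string HeaderClass = "sysinfoTableCategoryHeader";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection headers =
            doc.DocumentNode.SelectNodes("//body/tr[@class='" + HeaderClass + "']");
        if (headers == null) return doc.DocumentNode.OuterHtml;

        foreach (HtmlNode header in headers)
        {
            if (header.InnerText.Trim() != title) continue;

            // Collect first, then remove, so the sibling walk is not
            // disturbed by the removals.
            var doomed = new List<HtmlNode> { header };
            for (HtmlNode n = header.NextSibling; n != null; n = n.NextSibling)
            {
                bool isNextHeader = n.Name == "tr"
                    && n.GetAttributeValue("class", "") == HeaderClass;
                if (isNextHeader) break;
                doomed.Add(n); // property rows and whitespace text nodes
            }
            foreach (HtmlNode n in doomed) n.Remove();
            break; // only the first matching section
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling `SectionRemover.RemoveSection(html, "Operating System")` on the HTML from the question should return the body with only the "Motherboard" section left. Matching on the trimmed `InnerText` is a simplification; if header text may contain entities or nested markup, a stricter comparison would be needed.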