c#, html, html-agility-pack

HTMLAgilityPack - Finding the innermost table


I'm unsure how to write a dynamic XPath expression that finds the innermost tables (as an IEnumerable/List) in an HTML document, regardless of how deeply they are nested.

Essentially if I had:

    <table>
       <tr>
         <td>
           <table>
              <tr>
                 <td>
                     <table><tr><td>thisguy</td></tr></table>
                 </td>
              </tr>
              <tr>
                 <td>
                     <table><tr><td>thisguy</td></tr></table>
                 </td>
              </tr>
           </table>
        </td>
       </tr>
    </table>

I'm trying to return the tables whose td contains thisguy. Of course, this is just an example; the real tables don't contain that text.

I tried a recursive function, but this is as far as I got:

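    // Requires: using System.Collections.Generic; using System.Linq; using HtmlAgilityPack;
    private static IEnumerable<HtmlNode> GetBottomMostTable(HtmlNode nodeToCheck)
    {
        var isTableExist = nodeToCheck
                    .Descendants("table")
                    .Any();
        if (isTableExist)
        {
            // Recurses into the first nested table only, so sibling branches are never visited.
            return GetBottomMostTable(nodeToCheck.Descendants("table").First());
        }
        else
        {
            // No nested table: this node is the bottom of its branch.
            return new[] { nodeToCheck };
        }
    }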

Solution

  • Try this code:

    var innerTables = doc.DocumentNode.SelectNodes("//table[not(descendant::table)]");
    

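    This XPath selects every table node that has no table among its descendants, which is exactly the set of innermost tables at any nesting depth. Note that SelectNodes returns null rather than an empty collection when nothing matches, so check for null before iterating.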
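
For reference, here is a minimal sketch of how that XPath might be wrapped up, alongside a LINQ version that applies the same "no nested table" filter. The class name `InnermostTables`, the helper names `FindByXPath` and `FindByLinq`, and the sample cell texts are made up for illustration; only `SelectNodes`, `Descendants`, and `LoadHtml` come from HtmlAgilityPack itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class InnermostTables
{
    // Hypothetical helper: every <table> with no <table> descendant, via XPath.
    public static IReadOnlyList<HtmlNode> FindByXPath(HtmlDocument doc)
    {
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//table[not(descendant::table)]");
        return nodes == null ? new List<HtmlNode>() : nodes.ToList();
    }

    // Hypothetical helper: the same filter expressed with LINQ instead of XPath.
    public static IReadOnlyList<HtmlNode> FindByLinq(HtmlDocument doc)
    {
        return doc.DocumentNode
            .Descendants("table")
            .Where(table => !table.Descendants("table").Any())
            .ToList();
    }

    public static void Main()
    {
        // Sample markup shaped like the question's example (cell texts are illustrative).
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table><tr><td>" +
            "<table>" +
            "<tr><td><table><tr><td>first</td></tr></table></td></tr>" +
            "<tr><td><table><tr><td>second</td></tr></table></td></tr>" +
            "</table>" +
            "</td></tr></table>");

        // Print the text of each innermost table found.
        foreach (HtmlNode table in FindByXPath(doc))
        {
            Console.WriteLine(table.InnerText.Trim());
        }
    }
}
```

Both helpers should return the same nodes in document order. The XPath form is shorter, while the LINQ form avoids the null check and may be easier to extend with further C# conditions.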