c# html-table html-agility-pack

How to get a table inside a div, selecting the div by its id, using HtmlAgilityPack


I have HTML like this:

   <div id="info_tab_members">
        <div id="info_members" class="tabslevel">
            <ul>
                <li><a href="#info_tab_members_past">Past members</a></li>
                <li><a href="#info_tab_members_live">Live musicians</a></li> 
            </ul>

            <div id="info_tab_members_all">
                <div class="ui-tabs-panel">
                <!-- THIS TABLE I WANT -->
                <table class="display tblClass" cellpadding="0" cellspacing="0">....
                  <!-- DATA I WANT -->
                </table>
               </div>
            </div>                              

            <div id="info_tab_members_current">
                <div class="ui-tabs-panel">
                    <table class="display tblClass" cellpadding="0" cellspacing="0">        ...
                     </table>
                 </div>
            </div>      
        </div>
    </div>

How can I get the table inside the div with id info_tab_members_all? Note that several tables share the same class, display tblClass.

First I tried:

foreach (HtmlNode row in doc.DocumentNode.SelectNodes("table[@class='display tblClass']/tbody/tr"))
{
...
}     

But that returns rows from every table with the class display tblClass, so then I tried:

 var tbl = doc.DocumentNode.
                SelectSingleNode("//*[@id='info_tab_members_all']").
                SelectNodes("table[@class='display tblClass']/tbody/tr").
                ToList();

But I get this error:

“Object reference not set to an instance of an object”

How can I select only the table inside the div with id 'info_tab_members_all'?


Solution

  • If you're able to use HtmlAgilityPack.CssSelectors, then you're in luck:

    var table = htmlDoc.QuerySelectorAll("#info_tab_members_all table");
    // table is `IList<HtmlNode>`
    

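    If not, then you just need the right XPath. Here's a great reference for converting CSS to XPath and vice versa.

    Note that SelectSingleNode and SelectNodes both return null when nothing matches, so an XPath that misses surfaces as a null-related exception rather than an empty result. In your second attempt, the relative path table[...] only matches direct children of the div, but the table sits one level deeper, inside the div with class ui-tabs-panel. The /*/ step below accounts for that intermediate element.

    var table = htmlDoc.DocumentNode.SelectSingleNode("//*[@id='info_tab_members_all']/*/table");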
    // table is `HtmlNode`
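
To go from the table to its rows, the XPath approach can be sketched end to end as follows. This is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The class name MemberTableExample and the method RowTexts are hypothetical names made up for illustration, and the divId argument is assumed not to contain a single quote.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class MemberTableExample
{
    // Returns the trimmed text of each <tr> found in any table
    // nested under the div with the given id.
    public static List<string> RowTexts(string html, string divId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Anchor on the div id, then use // to descend to the table and
        // its rows. Using //tr instead of /tbody/tr means the query works
        // whether or not the markup contains an explicit <tbody>.
        var rows = doc.DocumentNode.SelectNodes(
            $"//div[@id='{divId}']//table//tr");

        // SelectNodes returns null (not an empty collection) when
        // nothing matches, so guard before enumerating.
        if (rows == null)
        {
            return new List<string>();
        }

        return rows.Select(row => row.InnerText.Trim()).ToList();
    }
}
```

Calling MemberTableExample.RowTexts(html, "info_tab_members_all") returns only the rows under that div. Tables under info_tab_members_current are not matched, because the XPath is anchored on the id rather than on the shared class.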