
VB.NET: Get data from an XElement <table></table>


I have loaded an XElement from my SharePoint site.

Dim elTable2 As XElement = <table border="1" id="table2" style="font-size:1em;border-collapse:collapse;display:inline;width:100%">
<tbody>
    <tr class="ms-rteTableHeaderRow-default" style="text-align:center">
        <th class="ms-rteTableHeaderFirstCol-default">​</th>
        <th class="ms-rteTableHeaderOddCol-default">LAN IP​</th>
        <th class="ms-rteTableHeaderEvenCol-default">Username​</th>
        <th class="ms-rteTableHeaderOddCol-default">Password​</th>
        <th class="ms-rteTableHeaderEvenCol-default">Port​</th>
        <th class="ms-rteTableHeaderOddCol-default">OS​</th>
        <th class="ms-rteTableHeaderEvenCol-default">Extra Info​</th>
     </tr>
     <tr class="ms-rteTableOddRow-default" style="text-align:center">
        <th class="ms-rteTableFirstCol-default">Netasq</th>
        <td class="ms-rteTableOddCol-default"></td>
        <td class="ms-rteTableEvenCol-default"></td>
        <td class="ms-rteTableOddCol-default">​</td>
        <td class="ms-rteTableEvenCol-default">​</td>
        <td class="ms-rteTableOddCol-default">​</td>
        <td class="ms-rteTableEvenCol-default">​</td>
     </tr>
  </tbody>
</table>

The table will have an unknown number of rows. Can I loop through all the rows and check what data is in the first column?


Solution

  • You could use Html Agility Pack for this. It is a .NET library that parses HTML content and supports XPath and XSLT.

    Example

    Dim table As XElement = <table border="1" id="table2" style="font-size:1em;border-collapse:collapse;display:inline;width:100%">
                                    <tbody>
                                        <tr class="ms-rteTableHeaderRow-default" style="text-align:center">
                                            <th class="ms-rteTableHeaderFirstCol-default">​Val</th>
                                            <th class="ms-rteTableHeaderOddCol-default">LAN IP​</th>
                                            <th class="ms-rteTableHeaderEvenCol-default">Username​</th>
                                            <th class="ms-rteTableHeaderOddCol-default">Password​</th>
                                            <th class="ms-rteTableHeaderEvenCol-default">Port​</th>
                                            <th class="ms-rteTableHeaderOddCol-default">OS​</th>
                                            <th class="ms-rteTableHeaderEvenCol-default">Extra Info​</th>
                                        </tr>
                                        <tr class="ms-rteTableOddRow-default" style="text-align:center">
                                            <td class="ms-rteTableFirstCol-default">Netasq</td>
                                            <td class="ms-rteTableOddCol-default"></td>
                                            <td class="ms-rteTableEvenCol-default"></td>
                                            <td class="ms-rteTableOddCol-default">​</td>
                                            <td class="ms-rteTableEvenCol-default">​</td>
                                            <td class="ms-rteTableOddCol-default">​</td>
                                            <td class="ms-rteTableEvenCol-default">​</td>
                                        </tr>
                                    </tbody>
                                </table>
    
    
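        ' Requires the HtmlAgilityPack NuGet package and "Imports HtmlAgilityPack" at the top of the file.
        Dim html = New HtmlDocument()
        html.LoadHtml(table.ToString())
        ' XPath selects every table row except the first (the header row).
        ' SelectNodes returns Nothing when nothing matches, so guard before looping.
        Dim rows = html.DocumentNode.SelectNodes("//tr[position()>1]")
        If rows IsNot Nothing Then
            For Each row As HtmlNode In rows
                ' First child element of the row, whether it is <th> or <td>.
                Dim cell As HtmlNode = row.SelectSingleNode("*[1]")
                If cell IsNot Nothing Then Console.WriteLine(cell.InnerText.Trim())
            Next
        End If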
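
Since the table is already an XElement, the same loop can probably be written with LINQ to XML alone, without an extra library. The following is a minimal sketch, not a tested drop-in: the module name `TableRows`, the helper `FirstColumnValues`, and the shortened sample table are hypothetical. It assumes the markup has no XML namespace (as in the question) and that the first row is the header. It also strips the zero-width spaces (U+200B) that SharePoint places in cells, because `String.Trim()` does not remove them on current .NET versions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module TableRows
    ' SharePoint's rich-text editor pads cells with zero-width spaces (U+200B).
    Private ReadOnly ZeroWidthSpace As String = ChrW(&H200B)

    ' Hypothetical helper: returns the cleaned text of the first cell
    ' (<th> or <td>) of every row after the header row.
    Function FirstColumnValues(table As XElement) As List(Of String)
        Dim values As New List(Of String)
        ' Descendants("tr") finds the rows whether or not a <tbody> wraps them.
        ' Skip(1) drops the header row.
        For Each row As XElement In table.Descendants("tr").Skip(1)
            Dim cell As XElement = row.Elements().FirstOrDefault()
            If cell IsNot Nothing Then
                values.Add(cell.Value.Replace(ZeroWidthSpace, "").Trim())
            End If
        Next
        Return values
    End Function

    Sub Main()
        ' Shortened stand-in for the SharePoint table from the question.
        Dim table As XElement =
            <table id="table2">
                <tbody>
                    <tr><th></th><th>LAN IP</th></tr>
                    <tr><th>Netasq</th><td></td></tr>
                </tbody>
            </table>

        For Each value As String In FirstColumnValues(table)
            Console.WriteLine(value)
        Next
    End Sub
End Module
```

Because each row's first child element is taken regardless of its tag name, this handles both the `<th>` first cell in the question's markup and the `<td>` first cell in the answer's copy. If the SharePoint markup is not well-formed XML, Html Agility Pack remains the more forgiving choice.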