c# html web-scraping datatable

Scrape a table from a web page in C#


What is the best approach to building a function that scrapes an HTML table on a web page into a variable?

I want to be able to pass it some unique identifier (such as the table ID) and have it return all the data in something like a DataTable.


Solution

  • You can use HtmlAgilityPack to parse the HTML and extract the table data.

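    Since HAP supports LINQ, you could start with something like this:

    HtmlDocument doc = ...
    // GetAttributeValue returns the fallback ("") for tables without an id,
    // whereas t.Attributes["id"].Value would throw a NullReferenceException.
    var myTable = doc.DocumentNode
                     .Descendants("table")
                     .FirstOrDefault(t => t.GetAttributeValue("id", "") == someTableId);

    if (myTable != null)
    {
        // further parsing here
    }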
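
To return the data as a DataTable, as the question asks, the "further parsing" step could look roughly like the sketch below. It is only one possible approach, not a definitive implementation: the class name HtmlTableScraper and the method ToDataTable are hypothetical, the HTML is assumed to be already available as a string, and the table is assumed to have no nested tables and no colspan/rowspan cells. A first row made up entirely of th cells is treated as the header; otherwise columns get the default DataTable names.

```csharp
using System.Data;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlTableScraper
{
    // Hypothetical helper: returns null when no table with the given id exists.
    public static DataTable ToDataTable(string html, string tableId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var table = doc.DocumentNode
                       .Descendants("table")
                       .FirstOrDefault(t => t.GetAttributeValue("id", "") == tableId);
        if (table == null)
            return null;

        var result = new DataTable(tableId);

        // Assumption: no nested tables, so every descendant <tr> belongs to this table.
        foreach (var tr in table.Descendants("tr"))
        {
            var cells = tr.ChildNodes
                          .Where(n => n.Name == "th" || n.Name == "td")
                          .ToList();
            if (cells.Count == 0)
                continue;

            var texts = cells.Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                             .ToList();

            // A first row consisting only of <th> cells becomes the column headers.
            if (result.Columns.Count == 0 && cells.All(c => c.Name == "th"))
            {
                foreach (var name in texts)
                {
                    // Fall back to an auto-generated name for empty or duplicate headers.
                    if (name.Length == 0 || result.Columns.Contains(name))
                        result.Columns.Add();
                    else
                        result.Columns.Add(name);
                }
                continue;
            }

            // Grow the column set if a data row is wider than anything seen so far.
            while (result.Columns.Count < texts.Count)
                result.Columns.Add();

            var row = result.NewRow();
            for (int i = 0; i < texts.Count; i++)
                row[i] = texts[i];
            result.Rows.Add(row);
        }

        return result;
    }
}
```

Calling HtmlTableScraper.ToDataTable(html, "someTableId") then gives a DataTable whose cells are all strings; converting columns to numeric or date types would be a separate step.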