c# html datatable html-agility-pack

Extracting data from a website table with HtmlAgilityPack


Yes, I have searched the web and Stack Overflow. I am having trouble extracting data from a table on a website. I can retrieve the full table with the code below, but I need to extract selected values:

Url = "https://www.multpl.com/shiller-pe/table/by-month";
web = new HtmlWeb();
doc = web.Load(Url);
pe = doc.DocumentNode.SelectSingleNode("//*[@id='datatable']").InnerText;
Console.Write(pe);

The XPath //*[@id='datatable']/tbody/tr[3]/td[2] for a single data point does not work and throws an error. This also does not work:

Url = "https://www.multpl.com/shiller-pe/table/by-month";
web = new HtmlWeb();
doc = web.Load(Url);
var table = doc.DocumentNode.SelectSingleNode("//*[@id='datatable']");
var tableRows = table.SelectNodes("tr");
var columns = tableRows[0].SelectNodes("th/text()");
for (int i = 1; i < tableRows.Count; i++)
{
    for (int e = 0; e < columns.Count; e++)
    {
        var value = tableRows[i].SelectSingleNode("td[e + 1]");
        Console.Write(columns[e].InnerText + ":" + value.InnerText);
    }
}

Any direction will help, thank you.


Solution

  • Found a solution finally.

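                // Url, web, doc and numberList (e.g. a List<double>) are assumed to be declared elsewhere.
                Url = "https://www.multpl.com/shiller-pe/table/by-month";
                web = new HtmlWeb();
                doc = web.Load(Url);
                // Select every <td class="right"> cell on the page and convert its text to a number.
                foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//td[@class='right']"))
                {
                    numberList.Add(Convert.ToDouble(node.InnerText));
                }
                // Print is not defined in this snippet; Console.WriteLine works in a console app.
                Print(numberList[0]);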
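
For anyone who also needs the date next to each value, here is a minimal sketch of a row-by-row variant. It is not the code from the answer above. It assumes the table has id 'datatable' (as in the question) and that each data row holds a date cell followed by a value cell. The class name ShillerTable, the helper ParseRows, and the sample markup are made up for illustration. The XPath uses //tr under the table so it matches whether or not the downloaded markup contains a tbody, and it checks for null because SelectNodes returns null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class ShillerTable
{
    // Hypothetical helper: returns (date text, value) pairs for the data rows
    // of the table whose id is 'datatable'.
    public static List<KeyValuePair<string, double>> ParseRows(string html)
    {
        var rows = new List<KeyValuePair<string, double>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//tr" below the table matches rows with or without a <tbody> in between.
        var trNodes = doc.DocumentNode.SelectNodes("//*[@id='datatable']//tr");
        if (trNodes == null)
            return rows; // no table or no rows found

        foreach (HtmlNode tr in trNodes)
        {
            // Rows without at least two <td> cells (e.g. a <th> header row) are skipped.
            var cells = tr.SelectNodes("td");
            if (cells == null || cells.Count < 2)
                continue;

            string date = HtmlEntity.DeEntitize(cells[0].InnerText).Trim();
            string raw = HtmlEntity.DeEntitize(cells[1].InnerText).Trim();

            // Invariant culture so "30.99" parses the same on every machine;
            // cells that are not plain numbers are skipped instead of throwing.
            if (double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out double value))
                rows.Add(new KeyValuePair<string, double>(date, value));
        }
        return rows;
    }

    public static void Main()
    {
        // Made-up sample markup standing in for the downloaded page.
        string sample =
            "<table id='datatable'>" +
            "<tr><th>Date</th><th>Value</th></tr>" +
            "<tr><td>Jan 1, 2020</td><td class='right'>30.99</td></tr>" +
            "<tr><td>Dec 1, 2019</td><td class='right'>30.33</td></tr>" +
            "</table>";

        foreach (var row in ParseRows(sample))
            Console.WriteLine(row.Key + ": " + row.Value.ToString(CultureInfo.InvariantCulture));
    }
}
```

To run it against the live page, the same helper could be fed with new HtmlWeb().Load(Url).DocumentNode.OuterHtml. Whether the real cells parse cleanly depends on the page's current markup, which this sketch does not verify.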