I have been trying numerous methods over the last few days to extract data from a table:
The Link to the website.
This is one version of the code, which I found online and adapted. I have tried many approaches, and I am unsure whether the XPath is wrong or the issue is occurring somewhere else:
private void button26_Click(object sender, EventArgs e)
{
    //BCFERRIES 2
    // URL of the website containing the table
    string url = "https://www.bcferries.com/current-conditions/SWB-TSA/";
    // Load the HTML content from the URL
    HtmlWeb web = new HtmlWeb();
    HtmlAgilityPack.HtmlDocument doc = web.Load(url);
    //string tableXPath = "//table[@class='table-class']";
    //string tableXPath = "//*[@id=\"tabs-1\"]/div[1]/table";
    //string tableXPath ="/html/body/main/section[6]/div[1]/div/div[5]/div[1]/div[1]/table";
    //string tableXPath = "//*[@id=\"tabs-1\"]";
    //*[@id="tabs-1"]/div[1]/table/tbody
    //string tableXPath = "//div[@id='tabs-1']/div[1]/table";
    string tableXPath = "//div[@id='tabs']";
    // Get the table from the HTML document
    HtmlNode tableNode = doc.DocumentNode.SelectSingleNode(tableXPath);
    //TEST
    //HtmlNode firstChild = tableNode.FirstChild;
    //HtmlNode firstChild = tableNode.LastChild;
    //HtmlNode firstChild = tableNode.NextSibling;
    //MessageBox.Show(firstChild.OuterHtml);
    //MessageBox.Show(firstChild.InnerHtml);
    // Check if the table exists
    if (tableNode != null)
    {
        // Get all rows in the table
        //var rows = tableNode.SelectNodes(".//tr");
        var rows = tableNode.SelectNodes("./tr");
        // Iterate through each row and display the data
        foreach (var row in rows)
        {
            //var cells = row.SelectNodes(".//td");
            var cells = row.SelectNodes("./td");
            if (cells != null)
            {
                foreach (var cell in cells)
                {
                    richTextBox1.AppendText(cell.InnerText.Trim() + "\t");
                    //MessageBox.Show(cell.InnerText.Trim());
                }
                richTextBox1.AppendText("\n");
                //MessageBox.Show("");
            }
        }
    }
    else
    {
        MessageBox.Show("Table not found on the website.");
    }
}
Each time I run the code, one of two things happens, depending on the XPath I use (I have included many of my attempts above): either it can't find the table, or it finds the node but shows a blank message box when I try to display the first child node, and then the program fails when it tries to read the first row.
Any help would be appreciated. I am trying to see whether I can read any of the time, boat, or status fields before I build out the array or list for storing the data.
Thanks, Doug
The response the code receives from this link is different from the one the browser receives. So I tried removing the trailing slash from the URL in this line:
string url = "https://www.bcferries.com/current-conditions/SWB-TSA/";
With the slash removed, the response contained the table.
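As a rough sketch of how that change might fit together with the row-reading loop from the question (not a verified solution for this particular page): the code below loads the URL without the trailing slash, looks rows up with the relative XPath ".//tr" so that rows nested under wrapper elements are found, and null-checks the result of SelectNodes, which returns null when nothing matches. The class name FerryTableSketch, the helper ExtractRows, and the default "//table" XPath are assumptions made for illustration. The page's real structure may need a more specific XPath.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, not part of the original question or answer.
static class FerryTableSketch
{
    // Returns one string array per row that contains at least one <td>.
    // tableXPath is an assumption; adjust it to the page's actual markup.
    public static List<string[]> ExtractRows(HtmlAgilityPack.HtmlDocument doc,
                                             string tableXPath = "//table")
    {
        var result = new List<string[]>();

        HtmlNode container = doc.DocumentNode.SelectSingleNode(tableXPath);
        if (container == null)
            return result; // nothing matched the XPath

        // ".//tr" searches all descendants. "./tr" only matches direct
        // children, so it finds nothing when the rows sit inside a nested
        // <table>, <tbody>, or <div>.
        HtmlNodeCollection rows = container.SelectNodes(".//tr");
        if (rows == null)
            return result; // SelectNodes returns null when there are no matches

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("./td");
            if (cells == null)
                continue; // e.g. a header row that only has <th> cells

            result.Add(cells
                .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToArray());
        }
        return result;
    }

    public static void Main()
    {
        // Same URL as in the question, with the trailing slash removed.
        string url = "https://www.bcferries.com/current-conditions/SWB-TSA";
        HtmlAgilityPack.HtmlDocument doc = new HtmlWeb().Load(url);

        foreach (string[] row in ExtractRows(doc))
            Console.WriteLine(string.Join("\t", row));
    }
}
```

In the WinForms handler from the question, the Console.WriteLine call would be replaced by richTextBox1.AppendText. Returning an empty list instead of null keeps the calling loop free of extra null checks, and passing "//div[@id='tabs']" as tableXPath should also work with this helper because the rows are searched as descendants rather than direct children.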