What I'm trying to do is parse an HTML document using Html Agility Pack. Loading the HTML document works. The problem appears when I try to query it with XPath: I get a "System.NullReferenceException: 'Object reference not set to an instance of an object.'" error.
To get my XPath I open Chrome DevTools, highlight the whole table whose rows contain the data I want to parse, right-click it, and choose Copy XPath.
Here's my code:
string url = "https://www.ctbiglist.com/index.asp";
string myPara = "LastName=Smith&FirstName=James&PropertyID=&Submit=Search+Properties";
string htmlResult;
// Get the raw HTML from the website
using (WebClient client = new WebClient())
{
    client.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
    // Send in the link along with the FirstName, LastName, and Submit POST request
    htmlResult = client.UploadString(url, myPara);
    //Console.WriteLine(htmlResult);
}
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlResult);
HtmlNodeCollection table = doc.DocumentNode.SelectNodes("//*[@id=\"Table2\"]/tbody/tr[2]/td/table/tbody/tr/td/div[2]/table/tbody/tr[2]/td/table/tbody/tr[2]/td/form/div/table[1]/tbody/tr");
Console.WriteLine(table.Count);
When I run the following code instead, it works, but it grabs every table in the HTML document:
var query = from table in doc.DocumentNode.SelectNodes("//table").Cast<HtmlNode>()
            from row in table.SelectNodes("//tr").Cast<HtmlNode>()
            from cell in row.SelectNodes("//th|td").Cast<HtmlNode>()
            select new { Table = table.Id, CellText = cell.InnerText };
foreach (var cell in query)
{
    Console.WriteLine("{0}: {1}", cell.Table, cell.CellText);
}
What I want is the one specific table whose rows hold the data I want to parse into objects.
Thanks for the help!!!
Change the line
from table in doc.DocumentNode.SelectNodes("//table").Cast<HtmlNode>()
to
from table in doc.DocumentNode.SelectNodes("//table[@id=\"Table2\"]").Cast<HtmlNode>()
This selects only the table with the given id. If the rows you want are in nested tables, you will have to change your XPath accordingly to reach the nested table rows.
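As a rough sketch of how the nested case might look, the snippet below makes three adjustments that often matter with Html Agility Pack: it avoids the `tbody` steps that Chrome's copied XPath includes (the browser adds `tbody` to its DOM even when the raw HTML has none), it uses relative paths (`./td`) for cells so the query stays inside the current row, and it checks for `null`, because `SelectNodes` returns `null` rather than an empty collection when nothing matches. The inline HTML is a made-up stand-in, and the assumption that the rows sit in a table nested inside `Table2` may not match the real page.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical stand-in for the downloaded page; the real markup may differ.
        string html =
            "<table id='Table2'><tr><td>header</td></tr>" +
            "<tr><td><table><tr><td>Smith</td><td>James</td></tr></table></td></tr>" +
            "</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // No tbody steps: match rows of any table nested inside Table2.
        HtmlNodeCollection rows =
            doc.DocumentNode.SelectNodes("//table[@id='Table2']//table//tr");

        // SelectNodes returns null when the XPath matches nothing.
        if (rows == null)
        {
            Console.WriteLine("No rows matched the XPath.");
            return;
        }

        foreach (HtmlNode row in rows)
        {
            // "./th|./td" is relative to this row; "//th|td" would search the whole document.
            HtmlNodeCollection cells = row.SelectNodes("./th|./td");
            if (cells == null)
            {
                continue;
            }

            foreach (HtmlNode cell in cells)
            {
                Console.WriteLine(cell.InnerText.Trim());
            }
        }
    }
}
```

If the `null` branch is hit against the real page, printing `htmlResult` and comparing it with what DevTools shows is a reasonable next step, since the XPath has to match the raw markup rather than the browser's DOM.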