I used this code to get the page info, but the site has since changed and my application now fails with a null reference error.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(page);
var query = doc.DocumentNode
.SelectNodes("//table[@class='table table-striped table-hover']/tr")
.Select(r => {
return new DelegationLink()
{
Row = r.SelectSingleNode(".//td").InnerText,
Category = r.SelectSingleNode(".//td[2]").InnerText
};
}).ToList();
And this is my HTML:
<div role="tabpanel" class="tab-pane fade " id="tab3">
<div class="circular-div">
<table class="table table-striped table-hover" id="circular-table">
<thead>
<tr>
<th>ردیف</th>
<th>دسته بندی</th>
<th>عنوان</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>بخشنامهها</td>
<td>اطلاعیه جهاد دانشگاهی</td>
</tr>
<tr>
<td>2</td>
<td>بخشنامهها</td>
...
...
...
What am I doing wrong?
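The table rows are not direct children of the table: they are nested inside thead and tbody, so the XPath ending in /tr matches nothing. When nothing matches, SelectNodes returns null instead of an empty collection, which is why your code fails with a null reference error. You also want to skip the header and scrape only the body of the table, so select the rows under tbody: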
var query = doc.DocumentNode
.SelectNodes("//table[@class='table table-striped table-hover']/tbody/tr")
.Select(r =>
{
return new DelegationLink()
{
Row = r.SelectSingleNode("./td[1]").InnerText, // first cell only; r.InnerText would return the text of the whole row
Category = r.SelectSingleNode(".//td[2]").InnerText
};
}
).ToList();
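If the site changes again, the same null error will come back, so it may be worth guarding against it. Below is a minimal sketch of a more defensive version, not a definitive implementation. It makes a few assumptions: the DelegationLink class is only guessed from the two properties shown in the question, CircularTableParser and Parse are hypothetical names, and the table is selected by its id (circular-table, taken from the posted HTML) rather than by the full class string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Assumed shape of the asker's class; only Row and Category appear in the question.
public class DelegationLink
{
    public string Row { get; set; }
    public string Category { get; set; }
}

// Hypothetical helper class, introduced only for this example.
public static class CircularTableParser
{
    public static List<DelegationLink> Parse(string page)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(page);

        // Select by id so that a change in the class list does not break the query.
        var rows = doc.DocumentNode
            .SelectNodes("//table[@id='circular-table']/tbody/tr");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (rows == null)
            return new List<DelegationLink>();

        return rows
            .Select(r => r.SelectNodes("./td"))
            // Skip rows that have fewer than two cells instead of throwing.
            .Where(cells => cells != null && cells.Count >= 2)
            .Select(cells => new DelegationLink
            {
                // DeEntitize turns entities such as &amp; back into plain text.
                Row = HtmlEntity.DeEntitize(cells[0].InnerText).Trim(),
                Category = HtmlEntity.DeEntitize(cells[1].InnerText).Trim()
            })
            .ToList();
    }
}
```

You would call it as var links = CircularTableParser.Parse(page); and get an empty list, rather than an exception, when the table or its rows cannot be found.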