I have 2 lists:
public List<string> my_link = new List<string>();
public List<string> english_word = new List<string>();
I am scraping some links from a page and saving them in my_link. For this I am using code like:
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://search.freefind.com/find.html?id=59478474&pid=r&ics=1&query=" + x);
HtmlNodeCollection nodes=doc.DocumentNode.SelectNodes("//font[@class='search-results']//a");
try
{
foreach (HtmlNode n in nodes)
{
link = n.InnerHtml;
link = link.Trim();
my_link.Add(link);
}
}
catch (NullReferenceException )
{
MessageBox.Show("NO link found ");
}
Then I visit each link I scraped, scrape some content from it, and store that content in english_word using english_word.Add(q);
It scrapes content from every link except the last one. My code looks like this:
foreach (string ss in my_link)
{
HtmlWeb web2 = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc2 = web2.Load(ss);
HtmlNodeCollection nodes2 = doc2.DocumentNode.SelectNodes("//table[@id='table1']//tr[position()>1]//td[position()=2]");
try
{
foreach (HtmlNode nn in nodes2)
{
q = nn.InnerText;
q = System.Net.WebUtility.HtmlDecode(q);
q = q.Trim();
english_word.Add(q);
}
}
catch (NullReferenceException ex)
{
MessageBox.Show("No english word is found ");
}
}
For the last link only, it shows "No english word is found ".
What am I doing wrong?
First, catching a NullReferenceException here is not a very good idea. It's better to check for null where you're expecting nulls.
Second, you most probably get this exception because the HtmlNode.SelectNodes method returns null (not an empty collection of nodes, as you might have expected) if no nodes are found. See "HTML Agility Pack Null Reference", "C#/ Html Agility pack error “Value cannot be null. Parameter name: Source.”", and a discussion on CodePlex.
So, instead of a try .. catch block you could use something like:
if (nodes2 != null)
{
foreach (HtmlNode nn in nodes2)
{
q = nn.InnerText;
q = System.Net.WebUtility.HtmlDecode(q);
q = q.Trim();
english_word.Add(q);
}
}
else
{
MessageBox.Show("No english word is found ");
}
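If you run into this null-instead-of-empty behaviour in several places, another option is to normalise null into an empty sequence once and loop without a check. The following is only a sketch: OrEmpty is a hypothetical helper name, not part of HtmlAgilityPack, and FindCells is a made-up stand-in for SelectNodes so that the example runs without the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Hypothetical helper (not part of HtmlAgilityPack):
    // treats a null sequence as an empty one.
    public static IEnumerable<T> OrEmpty<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}

public static class Demo
{
    // Made-up stand-in for SelectNodes: returns null when nothing matches.
    public static List<string> FindCells(bool hasMatches)
    {
        return hasMatches ? new List<string> { " cat ", " dog " } : null;
    }

    public static List<string> CollectWords(bool hasMatches)
    {
        var words = new List<string>();
        // Extension methods may be called on a null reference,
        // so this loop simply runs zero times when FindCells returns null.
        foreach (string cell in FindCells(hasMatches).OrEmpty())
        {
            words.Add(cell.Trim());
        }
        return words;
    }

    public static void Main()
    {
        Console.WriteLine(CollectWords(true).Count);  // prints 2
        Console.WriteLine(CollectWords(false).Count); // prints 0
    }
}
```

With HtmlAgilityPack the loop would presumably become foreach (HtmlNode nn in nodes2.OrEmpty()), assuming HtmlNodeCollection can be enumerated as IEnumerable<HtmlNode>. Note that this silently skips pages with no matches, so keep the explicit null check if you still want to show the "No english word is found " message.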