I'm currently learning C# and it's fun so far, but I have hit a roadblock.
I have a program that can scrape a webpage inside the web browser control for information.
So far I can get the HTML:
HtmlWindow window = webBrowser1.Document.Window;
string str = window.Document.Body.OuterHtml;
richTextBox1.Text = (str.ToString());
And the text:
HtmlWindow window = webBrowser1.Document.Window;
string str = window.Document.Body.OuterText;
richTextBox1.Text = (str.ToString());
I have tried to scrape and display the links like this:
HtmlWindow window = webBrowser1.Document.Window;
string str = window.Document.Body.GetElementsByTagName("A").ToString();
richTextBox1.Text = str;
But instead, the rich text box on the form is populated with this:
System.Windows.Forms.HtmlElementCollection
Do you know how I can get a list of links from the current webpage to show in the textbox?
Thanks, Chris.
With the Html Agility Pack it's easy:
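// Grab the HTML of the page currently loaded in the WebBrowser control.
HtmlWindow window = webBrowser1.Document.Window;
string str = window.Document.Body.OuterHtml;
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(str);
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlAgilityPack.HtmlNodeCollection nodes = htmlDoc.DocumentNode.SelectNodes("//a");
if (nodes != null)
{
    foreach (HtmlAgilityPack.HtmlNode node in nodes)
    {
        // Append the full <a ...>...</a> markup of each link on its own line.
        textBox1.Text += node.OuterHtml + "\r\n";
    }
}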
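If you'd rather not add a library, something along these lines should also work with the built-in WebBrowser types. The reason the original attempt printed System.Windows.Forms.HtmlElementCollection is that calling ToString() on the collection returns its type name; you have to loop over the elements yourself. This is only a sketch: the LinkLister class and ListLinks method are names I made up for illustration, and it assumes the page has finished loading (for example, you call it from the DocumentCompleted handler) so that Document is not null.

```csharp
using System.Text;
using System.Windows.Forms;

static class LinkLister
{
    // Hypothetical helper (not part of WinForms): returns the href of every
    // <a> element in the given document, one per line.
    public static string ListLinks(HtmlDocument document)
    {
        var sb = new StringBuilder();

        // HtmlElementCollection is enumerable, so foreach can walk it directly.
        foreach (HtmlElement link in document.GetElementsByTagName("a"))
        {
            string href = link.GetAttribute("href");

            // Skip anchors that have no href (e.g. named anchors).
            if (!string.IsNullOrEmpty(href))
            {
                sb.AppendLine(href);
            }
        }

        return sb.ToString();
    }
}
```

Usage on the form would then be a single line, richTextBox1.Text = LinkLister.ListLinks(webBrowser1.Document);. Building the string with a StringBuilder and assigning Text once avoids redrawing the control for every link.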