I want to extract fields/text from a page shown in a WebBrowser control on a Windows Forms form, using HtmlAgilityPack. I'm able to scrape the text in the background, but I want to do it from the WebBrowser inside my form.
I tried assigning WebBrowser1.Document to my HtmlDocument variable, but it seems the type cannot be converted.
This is the error I'm encountering
And these are the variable types.
Here's my code.
Imports System
Imports System.Xml
Imports HtmlAgilityPack

Public Class Form1
    Private Sub Form1_load(sender As System.Object, e As EventArgs) Handles MyBase.Load
        WebBrowser1.Navigate(TextBox3.Text)
    End Sub

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim link As String = TextBox3.Text
        Dim doc As HtmlDocument = New HtmlWeb().Load(link)
        Dim web_document As HtmlDocument = WebBrowser1.Document
        Dim name As HtmlNode = doc.DocumentNode.SelectSingleNode("//*[@id='details']/div[2]/div[2]/div/div[1]/h3")
        'if the div is found, print the inner text'
        If Not name Is Nothing Then
            TextBox1.Text = name.InnerText.Trim()
        End If
        Dim customer_number As HtmlNode = doc.DocumentNode.SelectSingleNode("//*[@id='details']/div[2]/div[2]/div/div[2]/dl[4]/dd")
        'if the div is found, print the inner text'
        If Not customer_number Is Nothing Then
            TextBox2.Text = customer_number.InnerText.Trim()
        End If
        MessageBox.Show("Doc variable: " + doc.GetType.ToString + Environment.NewLine + "web_document variable: " + web_document.GetType.ToString)
    End Sub

    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    End Sub
End Class
The problem is that WebBrowser1.Document returns a System.Windows.Forms.HtmlDocument, which is not the same type as HtmlAgilityPack.HtmlDocument.
If you want to use HtmlAgilityPack to scrape HTML from a page loaded in a WebBrowser control, you need to get the DocumentText from the browser control and load it into a new HtmlAgilityPack.HtmlDocument instance like this:
Dim doc As New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(WebBrowser1.DocumentText)
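One place to run this is the DocumentCompleted handler, so the page has finished loading before it is parsed. The following is only a sketch, assuming the Form1, WebBrowser1 and TextBox1 controls and the XPath from the question. The type names are fully qualified to avoid the clash with System.Windows.Forms.HtmlDocument.

```vb
Imports System.Windows.Forms

Public Class Form1
    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' DocumentCompleted can fire for each frame; only handle the top-level page.
        If Not e.Url.Equals(WebBrowser1.Url) Then Return

        ' Parse the HTML held by the browser control with HtmlAgilityPack.
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(WebBrowser1.DocumentText)

        ' XPath taken from the question; adjust it to the actual page.
        Dim name As HtmlAgilityPack.HtmlNode = doc.DocumentNode.SelectSingleNode("//*[@id='details']/div[2]/div[2]/div/div[1]/h3")
        If name IsNot Nothing Then
            TextBox1.Text = name.InnerText.Trim()
        End If
    End Sub
End Class
```

If the page builds its content with script, DocumentText may not reflect those changes. In that case, passing WebBrowser1.Document.Body.OuterHtml to LoadHtml is one alternative worth trying.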