I'm trying to write an application that automatically performs a search on a search engine (entering a specified phone number into the search box) and retrieves the resulting page in any format (XML, text). I've tried several web crawlers and scrapers, but none of them performs the search and saves the resulting data. They can only fetch the data of a page I request directly, which is not what I need.
The language can be C#, VB.NET or Java; it does not matter as long as it solves the problem. Thanks.
I'm looking for a tool or code snippet that does this.
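Many search sites put the query in the URL of the results page, so you can perform the search by requesting that URL directly. A function like this will return the page as a string, but you will have to parse the data in it yourself: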
    Imports System.IO
    Imports System.Net

    Public Class Networking
        ' Downloads the page at the given address and returns its body as a string.
        Public Shared Function GetWebPageData(ByVal address As Uri) As String
            Try
                Dim request As HttpWebRequest = DirectCast(WebRequest.Create(address), HttpWebRequest)
                Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
                    Using reader As New StreamReader(response.GetResponseStream())
                        ' Read the whole response body into a single string.
                        Return reader.ReadToEnd()
                    End Using
                End Using
            Catch ex As Exception
                'TODO handle the error here....
                Return ""
            End Try
        End Function
    End Class
Usage:
    Dim pageData As String = Networking.GetWebPageData(New Uri("http://www.hitta.se/077-570%2005%2000/f%C3%B6retag_och_personer"))
    Debug.WriteLine(pageData)
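Since Java was also mentioned as an option, here is a rough sketch of the same idea that builds the search URL from a phone number instead of hard-coding it. It assumes the site keeps the URL layout shown in the usage example above (query as a path segment, followed by the category segment), which may not hold for other sites or in the future. The class name PhoneSearch and the helpers encodeSegment, buildSearchUrl and fetch are made up for illustration; java.net.http.HttpClient requires Java 11 or later.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative, not part of any library.
class PhoneSearch {
    // Assumption: the URL layout is taken from the usage example above.
    static final String BASE = "http://www.hitta.se/";
    static final String CATEGORY = "f\u00F6retag_och_personer";

    // Percent-encodes a path segment. URLEncoder targets form data and
    // writes spaces as '+', so those are rewritten to "%20" afterwards.
    // A literal '+' in the input is already encoded as "%2B" at that point.
    static String encodeSegment(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Builds the results-page URL for the given phone number.
    static String buildSearchUrl(String phoneNumber) {
        return BASE + encodeSegment(phoneNumber) + "/" + encodeSegment(CATEGORY);
    }

    // Downloads the page and returns its body as a string.
    // Parsing the returned markup is still left to the caller.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling PhoneSearch.fetch(PhoneSearch.buildSearchUrl("077-570 05 00")) would then request the same address as the VB.NET usage line. For a site that expects the query as a form field or query-string parameter instead, only buildSearchUrl would need to change.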