VB.NET HttpWebRequest: get HTML through Google redirect links


Imports System.Net
Imports System.IO

Public Class Form1
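    ' Downloads the page at the given URL and returns its HTML, or Nothing on failure.
    Public Function GetHTML(ByVal url As Uri) As String
        Dim HTML As String = Nothing

        Try
            ' WebRequest.Create returns a WebRequest, so cast it explicitly (required under Option Strict On).
            Dim Request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)

            ' Using blocks dispose the response and reader even if reading fails.
            Using Response As HttpWebResponse = DirectCast(Request.GetResponse(), HttpWebResponse)
                Using Reader As New StreamReader(Response.GetResponseStream())
                    HTML = Reader.ReadToEnd()
                End Using
            End Using
        Catch ex As Exception
            ' Any failure (unsupported URL scheme, network error, HTTP error status) yields Nothing.
            HTML = Nothing
        End Try

        Return HTML
    End Function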

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim url As Uri = New Uri(TextBox1.Text)

        TextBox2.Text = GetHTML(url)
    End Sub
End Class

The code above fetches the HTML of a web page. The problem appears when I enter a Google redirect link such as http://www.google.com.sg/url?sa=t&rct=j&q=vb.net%20convert%20string%20to%20uri&source=web&cd=1&ved=0CFcQFjAA&url=http%3A%2F%2Fwww.vbforums.com%2Fshowthread.php%3Fp%3D3434187&ei=R0fxT872Cs2HrAesq4m-DQ&usg=AFQjCNGGedjegaM8osT689qWhbqpf6NI7Q

For that link, GetHTML returns

   <script>window.googleJavaScriptRedirect=1</script>
    <script>
    var f={};
    f.navigateTo=function(b,a,g){
      if(b!=a&&b.google)
      {
        if(b.google.r)
         {
           b.google.r=0;
           b.location.href=g;
           a.location.replace("about:blank");
         }
      }
      else
      {
        a.location.replace(g);
      }
    };

    f.navigateTo(window.parent,window,"http://www.vbforums.com/showthread.php?p\x3d3434187");

    </script>
    <noscript>
    <META http-equiv="refresh" content="0;URL='http://www.vbforums.com/showthread.php?p=3434187'">
    </noscript>

and not the HTML of http://www.vbforums.com/showthread.php?p=3434187

How can I make my code follow the redirect and return that page's HTML?


Solution

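  • Scrape the URL out of the meta tag and then make a new request to it. HttpWebRequest follows HTTP 3xx redirects automatically, but this page redirects through JavaScript and a meta refresh tag inside an ordinary response body, so the target has to be read from the HTML. For scraping I recommend HtmlAgilityPack, which you can download from http://html-agility-pack.net/ or install with NuGet.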
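
As a rough illustration of that approach, here is a minimal sketch that pulls the target out of a meta refresh tag and requests it again. To stay free of extra packages it uses a regular expression from System.Text.RegularExpressions rather than HtmlAgilityPack, which is a simplification: the pattern assumes http-equiv appears before content and that content is double-quoted, as in the page shown in the question. The names MetaRefreshSketch, ExtractMetaRefreshUrl, GetHtmlFollowingMetaRefresh and the maxHops parameter are hypothetical and not part of the original post.

```vbnet
Imports System.Net
Imports System.Text.RegularExpressions

Public Module MetaRefreshSketch
    ' Hypothetical helper: returns the target of a <meta http-equiv="refresh"> tag,
    ' or Nothing when the HTML does not contain one in the expected form.
    Public Function ExtractMetaRefreshUrl(ByVal html As String) As String
        If String.IsNullOrEmpty(html) Then Return Nothing

        ' Matches content="0;URL='http://...'" with or without single quotes around the URL.
        ' Assumes http-equiv comes before content and content is double-quoted.
        Dim pattern As String =
            "<meta[^>]*http-equiv\s*=\s*[""']?refresh[""']?[^>]*" &
            "content\s*=\s*""\s*\d+\s*;\s*url\s*=\s*'?([^'"">]+)'?\s*"""
        Dim m As Match = Regex.Match(html, pattern, RegexOptions.IgnoreCase)
        If Not m.Success Then Return Nothing

        ' Decode entities such as &amp; that may appear inside the attribute value.
        Return WebUtility.HtmlDecode(m.Groups(1).Value.Trim())
    End Function

    ' Hypothetical helper: downloads a page with the supplied fetch function and
    ' follows at most maxHops meta refresh redirects before returning the HTML.
    Public Function GetHtmlFollowingMetaRefresh(ByVal url As Uri,
                                                ByVal fetch As Func(Of Uri, String),
                                                Optional ByVal maxHops As Integer = 3) As String
        Dim html As String = fetch(url)

        For hop As Integer = 1 To maxHops
            Dim target As String = ExtractMetaRefreshUrl(html)
            If target Is Nothing Then Exit For

            ' Resolve relative targets against the page that contained the tag.
            url = New Uri(url, target)
            html = fetch(url)
        Next

        Return html
    End Function
End Module
```

Inside Form1, the button handler from the question could then call `TextBox2.Text = GetHtmlFollowingMetaRefresh(New Uri(TextBox1.Text), AddressOf GetHTML)`. The hop limit is there so that a page refreshing to itself cannot loop forever. For markup that varies more than this one Google page, an HTML parser such as HtmlAgilityPack is likely the more robust choice.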