
Can't get the content of a website with C#


This is my code for getting the content of a website:

private string GetContent(string url) {
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    var content = String.Empty;
    HttpStatusCode statusCode;
    using (var response = request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            var contentType = response.ContentType;
            Encoding encoding = null;
            if (contentType != null)
            {
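                // Capture only the charset token: skip an optional quote, stop at ';', '"' or whitespace.
                var match = Regex.Match(contentType, @"charset\s*=\s*""?([^"";\s]+)", RegexOptions.IgnoreCase);
                if (match.Success)
                    encoding = Encoding.GetEncoding(match.Groups[1].Value);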
            }

            encoding = encoding ?? Encoding.UTF8;

            statusCode = ((HttpWebResponse)response).StatusCode;
            using (var reader = new StreamReader(stream, encoding))
                content = reader.ReadToEnd();
        }
    return content;
}

I tried this code with http://google.com and it worked. But when I run it with http://batdongsan.com.vn/, it doesn't work and the response only shows "sorry! something went wrong.". I don't know why this happens. How can I get the content of the second link?


Solution

  • Looks like the site checks the User-Agent header, and since HttpWebRequest does not send one by default, the site returns an error message. I added the User-Agent my browser sent and was able to get the contents of that link. Just add the line that sets UserAgent, as shown below:

    // ...
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "GET";
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";
    
    var content = String.Empty;
    HttpStatusCode statusCode;
    // ...
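
For reference, here is a minimal, self-contained sketch of the same idea. The class name `PageFetcher` and the helper `CreateRequest` are hypothetical names chosen for illustration, not part of the original code. The sketch also reads the body as UTF-8 to stay short, which is an assumption; the charset detection from the question can be kept instead.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // User-Agent string copied from the answer above. Assumption: any
    // realistic browser User-Agent would probably satisfy the site's check.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";

    // Builds the request without sending it, so the headers can be inspected.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = BrowserUserAgent;
        return request;
    }

    // Sends the request and returns the response body as text.
    // Simplification: the body is decoded as UTF-8 (assumed, not detected).
    public static string GetContent(string url)
    {
        var request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `PageFetcher.GetContent("http://batdongsan.com.vn/")` should then behave like the patched method above, provided the site still only checks for the presence of a browser-like User-Agent.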