This is my code for getting the content of a website:
    private string GetContent(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        var content = String.Empty;
        HttpStatusCode statusCode;
        using (var response = request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            var contentType = response.ContentType;
            Encoding encoding = null;
            if (contentType != null)
            {
                var match = Regex.Match(contentType, @"(?<=charset\=).*");
                if (match.Success)
                    encoding = Encoding.GetEncoding(match.ToString());
            }
            encoding = encoding ?? Encoding.UTF8;
            statusCode = ((HttpWebResponse)response).StatusCode;
            using (var reader = new StreamReader(stream, encoding))
                content = reader.ReadToEnd();
        }
        return content;
    }
I ran this code against http://google.com and it worked. But when I run it against http://batdongsan.com.vn/, it doesn't work: the response says "sorry! something went wrong." I don't know why this happens. How can I get the content of the second link?
It looks like the site checks the User-Agent header. HttpWebRequest does not set one by default, so the site returns an error message. I sent the same User-Agent my browser sends and was able to get the contents of that link. Just add the line that sets UserAgent, as shown below:
// ...
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";
var content = String.Empty;
HttpStatusCode statusCode;
// ...
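For completeness, here is a minimal sketch of how the whole method might look with the User-Agent fix applied. The class name PageFetcher and the helper CreateRequest are made up for illustration. I also swapped the charset regex for the HttpWebResponse.CharacterSet property, which is my own substitution and not part of the original code. The User-Agent string is just the one my browser sent; I'm assuming any realistic browser string would do, but I have only tried this one.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, named only for this example.
public static class PageFetcher
{
    // The browser User-Agent from the answer above; assumed, not required, to be this exact value.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";

    // Builds the GET request and sets the User-Agent header the site appears to check.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = BrowserUserAgent;
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static string GetContent(string url)
    {
        var request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            var encoding = ResolveEncoding(response.CharacterSet);
            using (var reader = new StreamReader(stream, encoding))
                return reader.ReadToEnd();
        }
    }

    // Falls back to UTF-8 when the charset is missing or not recognized.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrEmpty(charset))
            return Encoding.UTF8;
        try
        {
            return Encoding.GetEncoding(charset);
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8;
        }
    }
}
```

You would call it the same way as before, for example `var html = PageFetcher.GetContent("http://batdongsan.com.vn/");`. On newer .NET versions WebRequest.Create is marked obsolete and HttpClient is the recommended replacement, but the idea of sending a browser-like User-Agent header is the same.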