I need to get the HTML of the page at https://bakerhughesrigcount.gcs-web.com/intl-rig-count. I tried using HttpClient, but the request times out. Could this site have anti-bot protection? I added User-Agent and Accept headers so that the request looks like a normal browser request, but that didn't help:
string url = "https://bakerhughesrigcount.gcs-web.com/intl-rig-count";
using (HttpClient client = new HttpClient())
{
    try
    {
        client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36");
        client.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
        HttpResponseMessage response = await client.GetAsync(url);
        response.EnsureSuccessStatusCode();
        string htmlContent = await response.Content.ReadAsStringAsync();
        Console.WriteLine(htmlContent);
    }
    catch (HttpRequestException e)
    {
        Console.WriteLine($"Error: {e.Message}");
    }
}
I also tried Selenium and was able to get the HTML with it, but how can I do this without Selenium or similar tools?
I suggest sending these headers as well:
client.DefaultRequestHeaders.Add("Accept-Language", "en-US,en;q=0.5");
client.DefaultRequestHeaders.Add("Accept-Encoding", "deflate,br");
Here is a complete working sample:
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class Program
{
    public static async Task Main()
    {
        string url = "https://bakerhughesrigcount.gcs-web.com/intl-rig-count/";
        using (HttpClient client = new HttpClient())
        {
            try
            {
                client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36");
                client.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
                client.DefaultRequestHeaders.Add("Accept-Language", "en-US,en;q=0.5");
                client.DefaultRequestHeaders.Add("Accept-Encoding", "deflate,br");
                var content = await client.GetStringAsync(url);
                Console.WriteLine(content);
            }
            catch (HttpRequestException e)
            {
                Console.WriteLine($"Error: {e.Message}");
            }
        }
    }
}
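Two things are worth checking with the sample above. Once Accept-Encoding is set by hand, the server may return a compressed body, and HttpClient only decompresses it if its handler is configured to do so. Also, HttpClient reports its own timeout as a TaskCanceledException, which a catch for HttpRequestException does not handle. Below is a minimal sketch of both points, assuming .NET Core 3.0 or later (DecompressionMethods.Brotli is not available on .NET Framework). The CreateClient helper and the 30-second timeout are illustrative choices of mine, and I have not verified that this changes how this particular site responds.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public class Program
{
    // Hypothetical helper: builds a client that decompresses responses itself.
    public static HttpClient CreateClient(TimeSpan timeout)
    {
        var handler = new HttpClientHandler
        {
            // With this set, the handler sends a matching Accept-Encoding header
            // and decodes gzip/deflate/Brotli bodies before they reach your code,
            // so the header does not need to be added manually.
            AutomaticDecompression = DecompressionMethods.GZip
                                   | DecompressionMethods.Deflate
                                   | DecompressionMethods.Brotli
        };

        var client = new HttpClient(handler) { Timeout = timeout };
        client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36");
        client.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
        client.DefaultRequestHeaders.Add("Accept-Language", "en-US,en;q=0.5");
        return client;
    }

    public static async Task Main()
    {
        string url = "https://bakerhughesrigcount.gcs-web.com/intl-rig-count/";

        // The timeout value is an assumption; adjust it to your needs.
        using (HttpClient client = CreateClient(TimeSpan.FromSeconds(30)))
        {
            try
            {
                string content = await client.GetStringAsync(url);
                Console.WriteLine(content);
            }
            catch (TaskCanceledException)
            {
                // Raised when HttpClient.Timeout elapses before the response arrives.
                Console.WriteLine("Error: the request timed out.");
            }
            catch (HttpRequestException e)
            {
                // Raised for connection failures and non-success status codes.
                Console.WriteLine($"Error: {e.Message}");
            }
        }
    }
}
```

If the request still times out with this setup, the timeout message makes that visible instead of letting the exception escape unhandled.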