C# - Get rendered HTML code using Selenium


I installed Selenium WebDriver (Selenium.WebDriver v3.141.0) via NuGet in a console application.
Here is some sample code that requests a web site with Internet Explorer, lets it render, and finally saves the resulting HTML markup.

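using System;
using OpenQA.Selenium.Remote;

// Wraps a WebDriver instance, returns a page's HTML, and quits the browser on Dispose.
public class WebSiteHtmlLoader : IDisposable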
{
    private readonly RemoteWebDriver _remoteWebDriver;

    public WebSiteHtmlLoader(RemoteWebDriver remoteWebDriver)
    {
        if (remoteWebDriver == null) throw new ArgumentNullException("remoteWebDriver");
        _remoteWebDriver = remoteWebDriver;
    }

    public string GetRenderedHtml(Uri webSiteUri)
    {
        if (webSiteUri == null) throw new ArgumentNullException("webSiteUri");
        _remoteWebDriver.Navigate().GoToUrl(webSiteUri);

        return _remoteWebDriver.PageSource;
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    private void Dispose(bool disposing)
    {
        if (disposing)
        {
            if (_remoteWebDriver != null)
            {
                _remoteWebDriver.Quit();
            }
        }
    }
}

Usage:

class Program
{
    static void Main(string[] args)
    {
        if (!args.Any())
        {
            return;
        }

        var pageUrl = args.First();
        var options = new InternetExplorerOptions
        {
            IntroduceInstabilityByIgnoringProtectedModeSettings = true,
            PageLoadStrategy = InternetExplorerPageLoadStrategy.Eager
        };

        using (var htmlLoader = new WebSiteHtmlLoader(new InternetExplorerDriver(options)))
        {
            var html = htmlLoader.GetRenderedHtml(new Uri(pageUrl, UriKind.Absolute));
            File.WriteAllText(@"C:\htmlloadertext.html", html);
        }
    }
}

The problem is that this code is deprecated.
I also get an error like this:

The name 'InternetExplorerPageLoadStrategy' does not exist in the current context

What is the updated, working code for Chrome or Firefox?


Edit 1:
When I remove this line:

PageLoadStrategy = InternetExplorerPageLoadStrategy.Eager

I get the error below:

An unhandled exception of type 'OpenQA.Selenium.DriverServiceNotFoundException' occurred in WebDriver.dll


Edit 2:
I get an error after changing the code to use Chrome instead of IE.
Here is the code:

class Program
{
    static void Main(string[] args)
    {
        var pageUrl = "https://mempool.space";
        var options = new ChromeOptions();
        //options.IntroduceInstabilityByIgnoringProtectedModeSettings = true;
        options.PageLoadStrategy = PageLoadStrategy.Eager;
        using (var htmlLoader = new WebSiteHtmlLoader(new ChromeDriver(options)))
        {
            var html = htmlLoader.GetRenderedHtml(new Uri(pageUrl, UriKind.Absolute));
            File.WriteAllText(@"C:\htmlloadertext.html", html);
        }
    }
}

And here is the error:

An unhandled exception of type 'OpenQA.Selenium.DriverServiceNotFoundException' occurred in WebDriver.dll


Solution

  • Using your sample from above, I changed your Program.Main to:

        static void Main(string[] args)
        {
            if (!args.Any())
            {
                return;
            }
    
            var pageUrl = args.First();
            var options = new ChromeOptions()
            {
                PageLoadStrategy = PageLoadStrategy.Eager
            };
    
            using (var htmlLoader = new WebSiteHtmlLoader(new ChromeDriver(options)))
            {
                var html = htmlLoader.GetRenderedHtml(new Uri(pageUrl, UriKind.Absolute));
                File.WriteAllText(@"C:\htmlloadertext.html", html);
            }
        }
    

    It worked fine. The packages I am using are:

      <PackageReference Include="Selenium.WebDriver" Version="3.141.0" />
      <PackageReference Include="Selenium.WebDriver.ChromeDriver" Version="92.0.4515.10700"/>
    

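    You also need to have v92 of Chrome installed on your PC, because the ChromeDriver version has to match the major version of the installed browser. With that in place, I was able to save https://google.com to the specified file.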
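
The DriverServiceNotFoundException from the question is thrown when WebDriver cannot find the driver executable (chromedriver.exe for Chrome); the Selenium.WebDriver.ChromeDriver package avoids it by copying that executable into the build output folder. Below is a minimal sketch of the same idea with the driver location made explicit. It assumes chromedriver.exe sits next to the application, and it assumes that polling document.readyState is a good enough signal that the page has finished loading (script-heavy pages may need a more specific condition). The RenderedHtmlSketch class name, the rendered.html file name, the --headless switch and the 10-second limit are illustrative choices, not part of the original code.

```csharp
using System;
using System.IO;
using System.Threading;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

static class RenderedHtmlSketch
{
    static void Main(string[] args)
    {
        if (args.Length == 0) return;

        var options = new ChromeOptions { PageLoadStrategy = PageLoadStrategy.Eager };
        options.AddArgument("--headless"); // optional: run without a visible window

        // Assumption: the NuGet package copied chromedriver.exe to the output folder.
        var driverDirectory = AppDomain.CurrentDomain.BaseDirectory;

        // Disposing the driver quits the browser and stops the driver process.
        using (var driver = new ChromeDriver(driverDirectory, options))
        {
            driver.Navigate().GoToUrl(new Uri(args[0], UriKind.Absolute));

            // Poll until the browser reports the document as loaded, or give up after 10 s.
            var deadline = DateTime.UtcNow.AddSeconds(10);
            while (DateTime.UtcNow < deadline)
            {
                var state = driver.ExecuteScript("return document.readyState") as string;
                if (state == "complete") break;
                Thread.Sleep(250);
            }

            // PageSource returns the DOM as currently rendered, not the raw response body.
            File.WriteAllText("rendered.html", driver.PageSource);
        }
    }
}
```

If the driver executable lives somewhere else, pass that folder as the first constructor argument instead of the application's base directory.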