Tags: c#, web-scraping, dynamics-crm, webbrowser-control

System.Windows.Forms.WebBrowser wait until page has been fully loaded


I have tried many different solutions based on waiting and async, but nothing seems to work. I have not found a solution that actually waits until the page has fully loaded. Everything I tried waits for some fixed time, not until the page has loaded, so I get an error in the next step.

How can I make the code wait until, for example, the element returned by Document.GetElementById("quickFind_text_0") is present on the page?

Here is my code:

    private void button7_Click(object sender, EventArgs e)
    {

        webBrowser1.Navigate("https://company.crm4.dynamics.com/main.aspx?app=d365default&pagetype=entitylist&etn=opportunity");

        webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("quickFind_text_0").SetAttribute("value", "Airbus");

        webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("quickFind_text_0").InnerText = "Airbus";

        //Thread.Sleep(2000);

        HtmlElement fbLink = webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("mainContent").Document.GetElementById("quickFind_button_0");
        fbLink.InvokeMember("click");
    }

P.S. I have to set the value "twice", otherwise it does not work:

    webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("quickFind_text_0").SetAttribute("value", "Airbus");

    webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("quickFind_text_0").InnerText = "Airbus";

In VBA this works:

    While .Busy
        DoEvents

    Wend
    While .ReadyState <> 4
        DoEvents
    Wend

Is it possible to do the same in C#?


EDIT:

My full code is below. For some reason async/await does not work; it throws this exception:

System.NullReferenceException HResult=0x80004003 Message=Object reference not set to an instance of an object. Source=v.0.0.01
StackTrace: at v._0._0._01.Browser.<button7_Click>d__7.MoveNext() in C:\Users\PC\source\repos\v.0.0.01\v.0.0.01\Browser.cs:line 69

Here is my code:

    using System;
    using System.Threading.Tasks;
    using System.Windows.Forms;

    namespace v._0._0._01
    {

        public static class WebBrowserExtensions
        {
            public static Task<Uri> DocumentCompletedAsync(this WebBrowser wb)
            {
                var tcs = new TaskCompletionSource<Uri>();
                WebBrowserDocumentCompletedEventHandler handler = null;
                handler = (_, e) =>
                {
                    wb.DocumentCompleted -= handler;
                    tcs.TrySetResult(e.Url);
                };
                wb.DocumentCompleted += handler;
                return tcs.Task;
            }
        }

        public partial class Browser : Form
        {

            public Browser()
            {
                InitializeComponent();
            }

            private async void button7_Click(object sender, EventArgs e)
            {

                webBrowser1.Navigate("https://company.crm4.dynamics.com/main.aspx?app=d365default&pagetype=entitylist&etn=opportunity");
                await webBrowser1.DocumentCompletedAsync(); // async magic
                HtmlElement fbLink = webBrowser1.Document.GetElementById("shell-container").Document.GetElementById("mainContent").Document.GetElementById("quickFind_button_0");
                fbLink.InvokeMember("click");

            }

        }

    }

I have also noticed that the IDs quickFind_text_0 and quickFind_button_0 always start with the same words, but the trailing number changes, for example quickFind_text_1 and quickFind_button_1, or quickFind_text_2 and quickFind_button_2. When I click manually, however, everything works with quickFind_text_0 and quickFind_button_0.


Solution

  • Here is an extension method for easy awaiting of the DocumentCompleted event:

    public static class WebBrowserExtensions
    {
        // Returns a task that completes the next time DocumentCompleted fires.
        public static Task<Uri> DocumentCompletedAsync(this WebBrowser wb)
        {
            var tcs = new TaskCompletionSource<Uri>();
            WebBrowserDocumentCompletedEventHandler handler = null;
            handler = (_, e) =>
            {
                // Unsubscribe first so the handler runs only once.
                wb.DocumentCompleted -= handler;
                tcs.TrySetResult(e.Url);
            };
            wb.DocumentCompleted += handler;
            return tcs.Task;
        }
    }
    

    It can be used like this:

    private async void button1_Click(object sender, EventArgs e)
    {
    
        webBrowser1.Navigate("https://company.crm4.dynamics.com/main.aspx");
        await webBrowser1.DocumentCompletedAsync(); // async magic
        HtmlElement fbLink = webBrowser1.Document.GetElementById("quickFind_button_0");
        fbLink.InvokeMember("click");
    }
    

    The lines after the await will run after the page has completed loading.
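
    For comparison with the VBA loop in the question, here is a minimal sketch of a non-blocking equivalent built on the control's IsBusy and ReadyState properties. The method name WaitUntilReadyAsync and its default timeout and interval values are assumptions made for illustration. Note that right after Navigate the control may still report the state of the previous document, so treat this as a sketch rather than a reliable replacement for awaiting DocumentCompleted.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class WebBrowserReadyStateExtensions
{
    // Hypothetical helper: polls IsBusy/ReadyState instead of spinning with DoEvents.
    // Call it from the UI thread; Task.Delay keeps the message loop responsive.
    public static async Task WaitUntilReadyAsync(this WebBrowser wb,
        int timeout = 30000, int interval = 100)
    {
        var stopwatch = Stopwatch.StartNew();
        while (wb.IsBusy || wb.ReadyState != WebBrowserReadyState.Complete)
        {
            if (stopwatch.ElapsedMilliseconds > timeout) throw new TimeoutException();
            await Task.Delay(interval);
        }
    }
}
```

    Even when ReadyState reports Complete, elements that the page creates later through script may not exist yet, so a null check on the result of GetElementById is still needed.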


    Update: Here is another extension method for awaiting a specific element to appear in the page:

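    // Extension method: place it in a static class such as WebBrowserExtensions.
    // Stopwatch requires "using System.Diagnostics;".
    public static async Task<HtmlElement> WaitForElementAsync(this WebBrowser wb,
        string elementId, int timeout = 30000, int interval = 500)
    {
        var stopwatch = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                // Document may be null before a page is loaded; the catch below covers that.
                var element = wb.Document.GetElementById(elementId);
                if (element != null) return element;
            }
            catch { } // Ignore transient errors while the page is still loading.
            if (stopwatch.ElapsedMilliseconds > timeout) throw new TimeoutException();
            await Task.Delay(interval); // Poll again after `interval` milliseconds.
        }
    }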
    

    It can be used, for example, after invoking a click event that modifies the page via XMLHttpRequest:

    someButton.InvokeMember("click");
    var mainContentElement = await webBrowser1.WaitForElementAsync("mainContent", 5000);
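
    The question also mentions that the numeric suffix of the IDs changes (quickFind_text_0, quickFind_text_1, and so on). One possible way to cope with that is to match on the ID prefix instead of the full ID. The sketch below is an assumption-laden illustration: the class WebBrowserLookupExtensions and the methods HasIdPrefix and FindByIdPrefix are hypothetical names, and it simply returns the first match found in Document.All.

```csharp
using System;
using System.Windows.Forms;

public static class WebBrowserLookupExtensions
{
    // Pure helper: true when the id is non-empty and starts with the prefix.
    public static bool HasIdPrefix(string id, string idPrefix)
    {
        return !string.IsNullOrEmpty(id)
            && id.StartsWith(idPrefix, StringComparison.Ordinal);
    }

    // Hypothetical helper: returns the first element whose id starts with
    // idPrefix, or null when no document is loaded or nothing matches.
    public static HtmlElement FindByIdPrefix(this WebBrowser wb, string idPrefix)
    {
        var document = wb.Document;
        if (document == null) return null;
        foreach (HtmlElement element in document.All)
        {
            if (HasIdPrefix(element.Id, idPrefix)) return element;
        }
        return null;
    }
}

// Possible usage (caller must still check for null):
//   HtmlElement searchBox = webBrowser1.FindByIdPrefix("quickFind_text_");
//   if (searchBox != null) searchBox.SetAttribute("value", "Airbus");
```

    If several elements share the prefix, this returns whichever comes first in document order, which may not be the visible one.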