Tags: c#, webbrowser-control, invokemember

Better approach in C# to search data on a third party web site


Here is my requirement. A public website takes an alphanumeric string as input and, on a button click, retrieves data into a table element. The table contains a couple of labels that are populated with the corresponding data. I need a tool or solution that can check whether a particular string exists in the website's database and, if so, retrieve the IDs of all occurrences of that string. Looking at the page source (no JavaScript is used there), I noted the names of the input element and the button element, and with the help of existing samples I was able to get a working solution. The code below works, but I would like to know whether there is a better and faster approach. I know it has some problems, such as an infinite loop, among others. What I am mainly looking for is an alternative solution that can run quickly over a million records.

    namespace SearchWebSite
    {
        public partial class Form1 : Form
        {
            bool searched = false;
            long i; 

            public Form1()
            {
                InitializeComponent();
            }

            private void button1_Click(object sender, EventArgs e)
            {
                i = 1;
                WebBrowser browser = new WebBrowser();
                string target = "http://www.SomePublicWebsite.com";
                browser.Navigate(target);
                browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(XYZ);
            }


            private void XYZ(object sender, WebBrowserDocumentCompletedEventArgs e)
            {
                WebBrowser b = null;
                if (searched == false)
                {
                    b = (WebBrowser)sender;
                    b.Document.GetElementById("txtId").InnerText = "M" + i.ToString();
                    b.Document.GetElementById("btnSearch").InvokeMember("click");
                    searched = true;
                }

                if (b.ReadyState == WebBrowserReadyState.Complete)
                {
                    if (b.Document.GetElementById("lblName") != null)
                    {
                        string IdNo = "M" + i.ToString();
                        string DateString = b.Document.GetElementById("lblDate").InnerHtml;
                        string NameString = b.Document.GetElementById("lblName").InnerHtml;

                        if (NameString != null && (NameString.Contains("XXXX") || NameString.Contains("xxxx")))
                        {
                            using (StreamWriter w = File.AppendText("log.txt"))
                            {
                                w.WriteLine("Id {0}, Date {1}, Name {2}", IdNo, DateString, NameString);
                                i = i + 1;
                                searched = false;
                            }
                        }
                        else
                        {
                            i = i + 1;
                            searched = false;
                        }
                    }
                    else
                    {
                        i = i + 1;
                        searched = false;
                    }
                }
            }
        }
    }

Solution

  • If the page shown after the search button is clicked still contains the txtId and btnSearch controls, then you can use the code snippet below. It is not faster, but I think it is the more correct form.

    public partial class Form1 : Form
    {
        bool searched = false;
        long i = 1;
        private string IdNo { get { return "M" + i.ToString(); } }
        public Form1()
        {
            InitializeComponent(); 
        }
    
        private void button1_Click(object sender, EventArgs e)
        {
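            i = 1;
            searched = false; // start again from the first page load
            WebBrowser browser = new WebBrowser();
            string target = "http://www.SomePublicWebsite.com";
            // Subscribe before navigating, which is the safer order.
            browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(XYZ);
            browser.Navigate(target);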
        }
        private void XYZ(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser b = (WebBrowser)sender;
            if (b.ReadyState == WebBrowserReadyState.Complete)
            {
                if (searched == false)
                {
                    DoSearch(b); return;
                }
                if (b.Document.GetElementById("lblName") != null)
                {
                    string DateString = b.Document.GetElementById("lblDate").InnerHtml;
                    string NameString = b.Document.GetElementById("lblName").InnerHtml;
    
                    if (NameString != null && (NameString.Contains("XXXX") || NameString.Contains("xxxx")))
                        using (StreamWriter w = File.AppendText("log.txt"))
                            w.WriteLine("Id {0}, Date {1}, Name {2}", IdNo, DateString, NameString);
                }
                i++;
                DoSearch(b);
            }
        }
        private void DoSearch(WebBrowser wb)
        {
            wb.Document.GetElementById("txtId").InnerText = IdNo;
            wb.Document.GetElementById("btnSearch").InvokeMember("click");
            searched = true;
        }
    }
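
The ID formatting and the name check in the snippet above can be pulled out into small pure functions, which makes them easy to verify without a browser. The following is only a sketch: `SearchHelpers`, `FormatId`, `NameMatches`, and `FormatLogLine` are hypothetical names that do not appear in the code above. Note that `NameMatches` is case-insensitive, so it accepts more than the original check, which tests only for "XXXX" and "xxxx".

```csharp
using System;

// Hypothetical helper class (not part of the original answer): isolates the
// ID formatting, name matching, and log formatting used by the handler.
public static class SearchHelpers
{
    // Builds the search key the same way the answer does: "M" + counter.
    public static string FormatId(long i)
    {
        return "M" + i.ToString();
    }

    // Case-insensitive substring test; a null name counts as "no match".
    // Assumption: any casing of the needle should match, which is broader
    // than the original "XXXX" / "xxxx" pair.
    public static bool NameMatches(string name, string needle)
    {
        return name != null
            && name.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Formats one log line in the same shape as the answer's WriteLine call.
    public static string FormatLogLine(string id, string date, string name)
    {
        return string.Format("Id {0}, Date {1}, Name {2}", id, date, name);
    }
}
```

Inside the handler, the condition would then read `if (SearchHelpers.NameMatches(NameString, "xxxx"))`, and the line written to log.txt would come from `SearchHelpers.FormatLogLine(IdNo, DateString, NameString)`.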