Tracking with JavaScript whether an Ajax request is in progress on a web page, or intercepting XMLHttpRequest through Selenium WebDriver


I am using Selenium WebDriver to crawl a web site that has infinite scroll (this site is only an example; I will be crawling other web sites too).

Problem statement:

Scroll down the infinite-scroll page until the content stops loading, using Selenium WebDriver.

My approach: currently I am doing this:

Step 1: Scroll to the page bottom

JavascriptExecutor js = (JavascriptExecutor) driver;
// Scroll to the largest of the known document heights, i.e. the page bottom.
js.executeScript(
        "window.scrollTo(0, Math.max(document.documentElement.scrollHeight," +
        "document.body.scrollHeight, document.documentElement.clientHeight));");

Then I wait for some time to let the Ajax request complete, like this:

Step 2: Explicitly wait for the Ajax request to finish

Thread.sleep(1000);

Then I run another script to check whether the page is still scrollable:

Step 3: Check if the page is scrollable

// document.height is obsolete; document.body.clientHeight is used instead.
// Refer to https://developer.mozilla.org/en-US/docs/DOM/document.height

    // executeScript may return a Long or a Double, so convert through Number.
    if (((Number) js.executeScript("return " +
            "(document.body.clientHeight - (window.pageYOffset + window.innerHeight))")).longValue() > 0)

If the above condition is true, I repeat Steps 1 to 3 until the condition in Step 3 is false.
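The Step 3 check can also be packaged as a small function that is injected once and then queried. This is only a sketch: `remainingScroll` is a hypothetical name, and it measures the page with `scrollHeight` rather than `document.body.clientHeight`, which is an assumption on my part and not what the snippet above uses.

```javascript
// Hypothetical helper (name and signature are not from the original post).
// Returns how many pixels of page remain below the current viewport.
function remainingScroll(win) {
  var doc = win.document;
  // Assumption: scrollHeight is used here instead of document.body.clientHeight.
  var pageHeight = Math.max(doc.documentElement.scrollHeight, doc.body.scrollHeight);
  var viewportBottom = win.pageYOffset + win.innerHeight;
  // Round and clamp so the caller always receives a non-negative integer.
  return Math.max(0, Math.round(pageHeight - viewportBottom));
}
```

From Java, the function text could be passed to executeScript followed by `return remainingScroll(window);`. Because the result is rounded to a whole number, it should arrive as a Long, and the loop would continue while it is greater than zero.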

The problem: I do not want to use Thread.sleep(1000); in Step 2. Instead, I would like to check with JavaScript whether the background Ajax request has finished, and then scroll down further if the condition in Step 3 is true.

PS: I am not the developer of the page, so I do not have access to the code running it; I can only inject JavaScript (as in Steps 1 and 3) into the web page. I also have to write generic logic that works for any web site that makes Ajax requests during infinite scroll.

I would be grateful if someone could spare some time here!

EDIT: OK, after struggling for 2 days, I have figured out that the pages I am crawling through Selenium WebDriver can use any of these JavaScript libraries, and I will have to poll differently for each library. For example, if the web application uses the jQuery API, I would wait for

(Long)((JavascriptExecutor)driver).executeScript("return jQuery.active")

to return a zero.

Likewise, if the web application uses the Prototype JavaScript library, I will have to wait for

(Long)((JavascriptExecutor)driver).executeScript("return Ajax.activeRequestCount")

to return a zero.
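The two library counters above can be read through one injected function, so the Java side does not need to know in advance which library is present. This is a minimal sketch covering only jQuery and Prototype; `pendingLibraryRequests` is a hypothetical name, and returning null when neither counter exists is my own convention.

```javascript
// Hypothetical helper: sums the in-flight request counters that the page's
// libraries expose, or returns null when neither counter is available.
function pendingLibraryRequests(win) {
  var total = 0;
  var found = false;
  // jQuery keeps its number of active Ajax requests in jQuery.active.
  if (win.jQuery && typeof win.jQuery.active === 'number') {
    total += win.jQuery.active;
    found = true;
  }
  // Prototype keeps its counter in Ajax.activeRequestCount.
  if (win.Ajax && typeof win.Ajax.activeRequestCount === 'number') {
    total += win.Ajax.activeRequestCount;
    found = true;
  }
  return found ? total : null;
}
```

A null result would tell the caller that neither library is loaded, so a different waiting strategy is needed for that page.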

Now, the problem is: how do I write generic code that can handle most of the available JavaScript libraries?

Problems I am facing in implementing this:

1. How do I find which JavaScript library is being used in the web application (using Selenium WebDriver in Java), so that I can then write the corresponding wait methods? Currently, I am using this

Code

2. This way I will have to write as many as 77 methods, one per JavaScript library, so I need a better way to handle this scenario as well.

In short, I need to figure out whether the browser is making any call (Ajax or simple), with or without a JavaScript library, through Selenium WebDriver's Java implementation.

PS: There are add-ons, Chrome's JavaScript Lib detector and Firefox's JavaScript Library detector, which detect the JavaScript library being used.
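For the first problem, one possible way to detect libraries from an injected script is to probe for the global objects they define. The sketch below is an assumption about how such a probe could look, not how the add-ons mentioned above work; `detectLibraries` is a hypothetical name and the list covers only four libraries.

```javascript
// Hypothetical helper: reports which well-known library globals exist on the page.
function detectLibraries(win) {
  // Each entry pairs a display name with the global the library defines.
  var probes = [
    ['jQuery', 'jQuery'],
    ['Prototype', 'Prototype'],
    ['MooTools', 'MooTools'],
    ['Dojo', 'dojo']
  ];
  return probes
    .filter(function (probe) { return win[probe[1]] != null; })
    .map(function (probe) { return probe[0]; });
}
```

Passing the function text to executeScript followed by `return detectLibraries(window);` should give the Java side a list of names to branch on. A presence check like this can be fooled by pages that define the same global names for other purposes.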


Solution

  • For web pages that issue Ajax requests during the infinite scroll (or other actions) and use the jQuery API, inject the following after opening the web page, before starting to scroll:

        // Inject the polling status variable
        js.executeScript("window.status = 'fail';");
    
        // Attach the Ajax callback: flip the flag whenever a jQuery Ajax request completes
        js.executeScript("$(document).ajaxComplete(function() {" +
        "status = 'success';});");
    

    Step 1: Remains the same as in the original question.

    Step 2: Poll the following script (this is what removes the need for Thread.sleep() and makes the logic more dynamic):

    String aStatus = (String) js.executeScript("return status;");
    
    if (aStatus != null && aStatus.equalsIgnoreCase("success")) {
        js.executeScript("status = 'fail';");
        break poolingLoop;
    }
    

    Step 3: No longer needed.

    Conclusion: there is no need to sprinkle blunt Thread.sleep() calls again and again while using Selenium WebDriver.

    This approach works well only if the web application uses the jQuery API.

    EDIT: As per the link given by @jayati, I injected the following JavaScript:

    JavaScript one:

    //XMLHttpRequest instrumentation/wrapping
    var startTracing = function (onnew) {
        var OldXHR = window.XMLHttpRequest;
    
        // create a wrapper object that has the same interfaces as a regular XMLHttpRequest object
        // see http://www.xulplanet.com/references/objref/XMLHttpRequest.html for reference on XHR object
        var NewXHR = function() {
            var self = this;
            var actualXHR = new OldXHR();
    
            // private callbacks (for UI):
            // onopen, onsend, onsetrequestheader, onupdate, ...
            this.requestHeaders = "";
            this.requestBody = "";
    
            // emulate methods from regular XMLHttpRequest object
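            this.open = function(a, b, c, d, e) { 
                self.openMethod = a.toUpperCase();
                self.openURL = b;
                ajaxRequestStarted = 'open';
    
                if (self.onopen != null && typeof(self.onopen) == "function") { 
                    self.onopen(a,b,c,d,e); } 
                // forward only the arguments actually passed: an explicit undefined
                // for the async flag would be coerced to false (a synchronous request)
                return actualXHR.open.apply(actualXHR, arguments); 
            }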
            this.send = function(a) {
                ajaxRequestStarted = 'send';
    
                if (self.onsend != null && typeof(self.onsend) == "function") { 
                    self.onsend(a); } 
                // skip null/undefined bodies so they are not recorded as the text "null"
                if (a != null) { self.requestBody += a; }
                return actualXHR.send(a); 
            }
            this.setRequestHeader = function(a, b) {
                if (self.onsetrequestheader != null && typeof(self.onsetrequestheader) == "function") { self.onsetrequestheader(a, b); } 
                self.requestHeaders += a + ":" + b + "\r\n";
                return actualXHR.setRequestHeader(a, b); 
            }
            this.getRequestHeader = function() {
                return actualXHR.getRequestHeader(); 
            }
            this.getResponseHeader = function(a) { return actualXHR.getResponseHeader(a); }
            this.getAllResponseHeaders = function() { return actualXHR.getAllResponseHeaders(); }
            this.abort = function() { return actualXHR.abort(); }
            this.addEventListener = function(a, b, c) { return actualXHR.addEventListener(a, b, c); }
            this.dispatchEvent = function(e) { return actualXHR.dispatchEvent(e); }
            this.openRequest = function(a, b, c, d, e) { return actualXHR.openRequest(a, b, c, d, e); }
            this.overrideMimeType = function(e) { return actualXHR.overrideMimeType(e); }
            this.removeEventListener = function(a, b, c) { return actualXHR.removeEventListener(a, b, c); }
    
            // copy the values from actualXHR back onto self
            function copyState() {
                // copy properties back from the actual XHR to the wrapper
                try {
                    self.readyState = actualXHR.readyState;
                } catch (e) {}
                try {
                    self.status = actualXHR.status;
                } catch (e) {}
                try {
                    self.responseText = actualXHR.responseText;
                } catch (e) {}
                try {
                    self.statusText = actualXHR.statusText;
                } catch (e) {}
                try {
                    self.responseXML = actualXHR.responseXML;
                } catch (e) {}
            }
    
            // emulate callbacks from regular XMLHttpRequest object
            actualXHR.onreadystatechange = function() {
                copyState();
    
                try {
                    if (self.onupdate != null && typeof(self.onupdate) == "function") { self.onupdate(); } 
                } catch (e) {}
    
                // onreadystatechange callback            
                if (self.onreadystatechange != null && typeof(self.onreadystatechange) == "function") { return self.onreadystatechange(); } 
            }
            actualXHR.onerror = function(e) {
    
                ajaxRequestComplete = 'err';
                copyState();
    
                try {
                    if (self.onupdate != null && typeof(self.onupdate) == "function") { self.onupdate(); } 
                } catch (e) {}
    
                if (self.onerror != null && typeof(self.onerror) == "function") { 
                    return self.onerror(e); 
                } else if (self.onreadystatechange != null && typeof(self.onreadystatechange) == "function") { 
                    return self.onreadystatechange(); 
                }
            }
            actualXHR.onload = function(e) {
    
                ajaxRequestComplete = 'loaded';
                copyState();
    
                try {
                    if (self.onupdate != null && typeof(self.onupdate) == "function") { self.onupdate(); } 
                } catch (e) {}
    
                if (self.onload != null && typeof(self.onload) == "function") { 
                    return self.onload(e); 
                } else if (self.onreadystatechange != null && typeof(self.onreadystatechange) == "function") { 
                    return self.onreadystatechange(); 
                }
            }
            actualXHR.onprogress = function(e) {
                copyState();
    
                try {
                    if (self.onupdate != null && typeof(self.onupdate) == "function") { self.onupdate(); } 
                } catch (e) {}
    
                if (self.onprogress != null && typeof(self.onprogress) == "function") { 
                    return self.onprogress(e);
                } else if (self.onreadystatechange != null && typeof(self.onreadystatechange) == "function") { 
                    return self.onreadystatechange(); 
                }
            }
    
            if (onnew && typeof(onnew) == "function") { onnew(this); }
        }
    
        window.XMLHttpRequest = NewXHR;
    
    }
    window.ajaxRequestComplete = 'no';//Make as a global javascript variable
    window.ajaxRequestStarted = 'no';
    startTracing();
    

    Or JavaScript two:

    var startTracing = function (onnew) {
        window.ajaxRequestComplete = 'no';//Make as a global javascript variable
        window.ajaxRequestStarted = 'no';
    
        XMLHttpRequest.prototype.uniqueID = function() {
            if (!this.uniqueIDMemo) {
                this.uniqueIDMemo = Math.floor(Math.random() * 1000);
            }
            return this.uniqueIDMemo;
        }
    
        XMLHttpRequest.prototype.oldOpen = XMLHttpRequest.prototype.open;
    
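        var newOpen = function(method, url, async, user, password) {
    
            ajaxRequestStarted = 'open';
            /*alert(ajaxRequestStarted);*/
            // forward only the arguments actually passed: an explicit undefined
            // for async would be coerced to false and make the request synchronous
            return this.oldOpen.apply(this, arguments);
        }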
    
        XMLHttpRequest.prototype.open = newOpen;
    
        XMLHttpRequest.prototype.oldSend = XMLHttpRequest.prototype.send;
    
        var newSend = function(a) {
            var xhr = this;
    
            var onload = function() {
                ajaxRequestComplete = 'loaded';
                /*alert(ajaxRequestComplete);*/
            };
    
            var onerror = function( ) {
                ajaxRequestComplete = 'Err';
                /*alert(ajaxRequestComplete);*/
            };
    
            xhr.addEventListener("load", onload, false);
            xhr.addEventListener("error", onerror, false);
    
            xhr.oldSend(a);
        }
    
        XMLHttpRequest.prototype.send = newSend;
    }
    startTracing();
    

    By checking the status variables ajaxRequestStarted and ajaxRequestComplete from the Java code, one can determine whether an Ajax request was started or has completed.

    Now I have a way to wait until an Ajax request is complete, and I can also find out whether an Ajax request was triggered by some action.
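The two status strings only record the most recent event, so overlapping requests can be hard to tell apart. A variation on the same prototype-patching idea is to keep a count of requests in flight. The following is a minimal sketch under my own assumptions: `installXhrCounter` and `__pendingXhr` are hypothetical names, and it relies on the `loadend` event, which fires once after a request finishes by load, error, abort, or timeout.

```javascript
// Hypothetical helper: counts XMLHttpRequests that were sent but have not finished.
function installXhrCounter(win) {
  if (win.__pendingXhr !== undefined) { return; } // already installed
  win.__pendingXhr = 0;

  var proto = win.XMLHttpRequest.prototype;
  var originalSend = proto.send;

  proto.send = function () {
    var xhr = this;
    var settled = false;

    // Decrement exactly once per send(), however the request ends.
    function settle() {
      if (settled) { return; }
      settled = true;
      win.__pendingXhr -= 1;
      xhr.removeEventListener('loadend', settle, false);
    }

    win.__pendingXhr += 1;
    xhr.addEventListener('loadend', settle, false);
    try {
      return originalSend.apply(xhr, arguments);
    } catch (e) {
      settle(); // send() failed synchronously, so no loadend will follow
      throw e;
    }
  };
}

// Install on the real page when running in a browser.
if (typeof window !== 'undefined') {
  installXhrCounter(window);
}
```

From Java, Step 2 could then poll `return window.__pendingXhr;` until it reaches zero. This sketch only sees XMLHttpRequest traffic sent after the script was injected; requests made through `fetch` are not counted.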