javascript html firefox javascript-framework mutation-observers

Determine whether an element was added by JS vs. the original HTML document *OR* detect when a script updates a node via innerHTML


In short, I need to know whether certain elements are on the page because some script inserted them via the innerHTML property of a parent element, or because they were part of the original HTML document that was downloaded. These two possibilities mean very different things in this (absurd) application.

The Actual Use Case:

A 3rd party script updates random elements on a page by setting their innerHTML property. I have full control over the browser (WPF / GeckoFx / XulRunner) and can inject and modify (new) JS at will, but I have no insight into, or ability to modify, the heavily obfuscated 3rd party script.

The ONLY way to get the data I need is to determine, after page load, whether certain elements on the screen (if they exist) were loaded by the third party script (innerHTML) or were part of the original HTML document before the 3rd party script ran.


Simply comparing the original HTML source of the page to its final state is difficult, because the original page contains a lot of inline scripting.

Does anyone have any ideas?


Solution

  • Unfortunately, the suggestions to use mutation observers don't apply to this circumstance. Mutation observers are agnostic to the reason why a DOM node was added to the page; they only report that one was. This means it is impossible to determine whether a piece of the DOM was added because the page is still loading or because a script has fired and added content dynamically.
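
To illustrate the limitation, here is a minimal sketch of what a mutation observer callback actually receives. The helper name `collectAddedNodes` is hypothetical; the observer wiring uses the standard `MutationObserver` API and only runs where that API and a `document` exist.

```javascript
// A MutationRecord carries type, target, addedNodes, removedNodes,
// previousSibling, nextSibling, attributeName, attributeNamespace and
// oldValue. None of these fields says what caused the change.

// Hypothetical helper: gather every node reported as added.
function collectAddedNodes(records) {
  const added = [];
  for (const record of records) {
    if (record.type !== "childList") continue; // skip attribute/text changes
    for (const node of Array.from(record.addedNodes)) {
      added.push(node);
    }
  }
  return added;
}

// Browser-only wiring; skipped in environments without a DOM.
if (typeof MutationObserver !== "undefined" && typeof document !== "undefined") {
  const observer = new MutationObserver(function (records) {
    const nodes = collectAddedNodes(records);
    // We learn that nodes appeared, but not whether the parser or a script added them.
    console.log(nodes.length + " node(s) added, cause unknown");
  });
  observer.observe(document.documentElement, { childList: true, subtree: true });
}
```

Nodes created by the HTML parser during load and nodes created by an innerHTML assignment both arrive as the same kind of `childList` record.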

    HOWEVER

    This article explains how to override the innerHTML getter/setter for every element in the DOM: http://msdn.microsoft.com/en-us/library/dd229916(v=vs.85).aspx Since the innerHTML setter is only ever invoked by JavaScript, it becomes trivial for me to know whether or not a certain part of the DOM was loaded through it.

    While that is almost certainly overkill and not a good idea for most applications, for strange situations such as this, and the building of js frameworks, it likely makes good sense.

    In case that article goes offline at some point, my initial code looks similar to the following:

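    // isInIE() is the author's own browser check; its implementation is not shown here.
    var elem = isInIE() ? HTMLElement : Element;    // IE and FF expose innerHTML on different prototypes.
    var proxiedInnerHTML = Object.getOwnPropertyDescriptor(elem.prototype, "innerHTML");

    Object.defineProperty(elem.prototype, "innerHTML", {
        // Only the setter is replaced; the existing getter is left untouched.
        set: function ( htmlContent )
        {
            // custom code goes here

            // Forward to the browser's original setter so the DOM still updates.
            proxiedInnerHTML.set.call(this, htmlContent);
        }
    });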
    

    Be warned: in older browsers, or if you pick the wrong prototype (HTMLElement vs Element), the failure shows up when innerHTML is later assigned, not when the property is defined.
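
As one possible way to fill in the "custom code" step for the use case in the question, the sketch below remembers every element whose innerHTML was assigned and then answers whether a given node sits beneath such an element. The names `trackInnerHTMLWrites`, `cameFromInnerHTML` and `restore` are hypothetical, and the prototype is passed in as a parameter so the choice between `Element.prototype` and `HTMLElement.prototype` stays with the caller.

```javascript
// Hypothetical helper: wrap the innerHTML setter found on `proto` and record
// every element that is written to afterwards.
function trackInnerHTMLWrites(proto) {
  const original = Object.getOwnPropertyDescriptor(proto, "innerHTML");
  if (!original || typeof original.set !== "function") {
    throw new Error("no innerHTML setter on the given prototype");
  }

  // WeakSet so that tracked elements can still be garbage collected.
  const written = new WeakSet();

  Object.defineProperty(proto, "innerHTML", {
    configurable: true,
    enumerable: original.enumerable,
    get: original.get,
    set: function (htmlContent) {
      written.add(this);
      original.set.call(this, htmlContent);
    }
  });

  return {
    // True when some ancestor of `node` had its innerHTML assigned.
    cameFromInnerHTML: function (node) {
      for (let current = node.parentNode; current; current = current.parentNode) {
        if (written.has(current)) return true;
      }
      return false;
    },
    // Put the browser's original accessor back.
    restore: function () {
      Object.defineProperty(proto, "innerHTML", original);
    }
  };
}

// In a browser, before the 3rd party script runs:
// const tracker = trackInnerHTMLWrites(Element.prototype);
// ...later: tracker.cameFromInnerHTML(document.getElementById("some-id"));
```

This is only an approximation: a node appended with `appendChild` underneath an element that was earlier written via innerHTML would also be reported as injected, and the wrapper has to be installed before the script of interest executes.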

    Dealing with prototypes in browsers:

    I tested this block in FF and IE, but not in Chrome. More importantly, I found posts stating that the W3C spec does not specify how browsers implement inheritance among their element types, so there is no guarantee that HTMLDivElement will use the HTMLElement or Element base accessor for innerHTML in future or past versions of any given browser.

    That said, it is pretty simple to create a webpage containing every HTML element type and test whether this technique works on each of them. For IE and FF, as of Jan 2015, this technique works across the board.
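
Such a check could be sketched roughly as follows. The helper `findTagsMissingHook` is hypothetical: it temporarily installs a counting setter on the given prototype, assigns innerHTML on a fresh element for each tag name, and returns the tags whose assignment never reached the hook.

```javascript
// Hypothetical helper: report which tag names do NOT route innerHTML writes
// through the accessor defined on `proto`.
function findTagsMissingHook(doc, proto, tagNames) {
  const original = Object.getOwnPropertyDescriptor(proto, "innerHTML");
  if (!original || typeof original.set !== "function") {
    throw new Error("no innerHTML setter on the given prototype");
  }

  let hits = 0;
  Object.defineProperty(proto, "innerHTML", {
    configurable: true,
    enumerable: original.enumerable,
    get: original.get,
    set: function (htmlContent) {
      hits += 1;
      original.set.call(this, htmlContent);
    }
  });

  const missing = [];
  try {
    for (const tag of tagNames) {
      const before = hits;
      try {
        doc.createElement(tag).innerHTML = "";
      } catch (err) {
        // Some engines reject innerHTML writes on certain elements; the
        // counter still tells us whether the hook was reached.
      }
      if (hits === before) missing.push(tag);
    }
  } finally {
    // Always put the original accessor back.
    Object.defineProperty(proto, "innerHTML", original);
  }
  return missing;
}

// In a browser console (the tag list is only a sample):
// findTagsMissingHook(document, Element.prototype, ["div", "span", "table", "select"]);
```

An empty result suggests that every tested element type inherits innerHTML from the chosen prototype in that particular browser version; it says nothing about other versions.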

    Old Browser Support:

    Though I am not using them, older browsers offer the following non-standard accessor methods:

    document.__defineGetter__("test", /* getter function */ );
    document.__defineSetter__("test", /* setter function */ );
    document.__lookupGetter__("test");
    document.__lookupSetter__("test");
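
For completeness, the same wrapping idea could be expressed with these legacy methods roughly as below. This is an untested-in-old-browsers sketch; `wrapSetterLegacy` and its `beforeSet` callback are hypothetical names, and the getter is re-installed only as a precaution in case an engine drops it when the setter is redefined.

```javascript
// Hypothetical helper: wrap an existing setter using the legacy
// __lookupGetter__/__lookupSetter__/__defineGetter__/__defineSetter__ methods.
function wrapSetterLegacy(target, name, beforeSet) {
  const originalGetter = target.__lookupGetter__(name);
  const originalSetter = target.__lookupSetter__(name);
  if (typeof originalSetter !== "function") {
    throw new Error("no setter found for property: " + name);
  }

  target.__defineSetter__(name, function (value) {
    beforeSet(this, value);            // custom code goes here
    originalSetter.call(this, value);  // then forward to the original setter
  });

  // Precaution: make sure the getter is still in place afterwards.
  if (typeof originalGetter === "function") {
    target.__defineGetter__(name, originalGetter);
  }
}

// Possible browser usage (assumes innerHTML lives on Element.prototype):
// wrapSetterLegacy(Element.prototype, "innerHTML", function (el, html) {
//   console.log("innerHTML set on", el);
// });
```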
    

    Thank you to RobG for sending me down this path.