
Getting proper HTML contents of a DIV (JavaScript WSH, XMLDOM ActiveX)


I'm having a lot of trouble figuring out how to select contents from specific HTML elements (which are in fact nodes) from an HTML file.

I'll admit first of all that this isn't "well-formed xml" but unless this really is my problem, I doubt it really matters. Take this file:

<html>
    <body id="464">
        <div id="fullname"> Use Cases  </div>
        <div id="intro"> <b><font color="#000033">Use Cases </font></b> </div>
    </body>
</html>

And this bare-bones code, extracted from my full script:

xmlSourceTranslation = new ActiveXObject("Msxml2.DOMDocument.6.0");
xmlSourceTranslation.async="false";
xmlSourceTranslation.load(file.html);
xmlSourceTranslation = xmlSourceTranslation.documentElement;

var sourceNode = xmlSourceTranslation.selectSingleNode("//*[@id = 'fullname']");
if (typeof sourceNode === 'object') {
        sourceText = sourceNode.firstChild.nodeValue;
}

The problem is that, depending on whether I select the fullname or the intro div and on the accessor I use (.firstChild.nodeValue, .innerHTML, .firstChild.innerHTML, .childNodes), I get null, undefined, or an "Object required" error when I try to access the result. The only approach that works every time is sourceNode.text, but for the intro div it returns only "Use Cases " instead of the HTML, which is what I need.

I have been hitting my head on my desk for almost 2 days trying to figure it out.


Solution

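    1. .async needs a boolean (false), not a string ("false")
    2. .load needs a string; file.html without quotes is a property lookup on a variable named file, not a file name
    3. clobbering the XML object with its documentElement is a bad idea: the document, and with it parseError, is no longer reachable through that variable
    4. using .load without an error check is bad practice
    5. .selectSingleNode returns null on failure, so test for null; typeof null is also 'object', so the typeof check never fails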

    Minimal skeleton for XML work (adapted for your problem):

    var sFSpec = "..\\data\\24272956.xml";
    var sXPath = "//*[@id = 'fullname']";
    var oXml   = new ActiveXObject("Msxml2.DOMDocument.6.0");
    var sOtp   = "???";
    oXml.async = false;
    if (oXml.load(sFSpec)) {
       var ndDE  = oXml.documentElement;
       var ndSrc = ndDE.selectSingleNode(sXPath);
       if (ndSrc !== null) {
          sOtp = ndSrc.firstChild.nodeValue;
       } else {
          sOtp = "no node found for " + sXPath;
       }
    } else {
       sOtp = sFSpec + ": " + oXml.parseError.reason;
    }
    WScript.Echo(sOtp);
    

    output (good):

    cscript 24272956.js
     Use Cases
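
The skeleton prints the text of the fullname div, but the question asks for the markup inside the intro div. There, firstChild is likely to be the b element (MSXML drops whitespace-only text nodes unless preserveWhiteSpace is set), and the nodeValue of an element is null. A minimal sketch of one way to get the markup, assuming the same file as above: every MSXML node has an .xml property that serializes it together with its descendants, so joining the .xml of each child gives something close to innerHTML. The helper name innerXml is made up for this example.

```javascript
// Hypothetical helper: concatenate the serialized form of every child node.
// It relies only on childNodes.length, childNodes.item(i) and the .xml
// property, all of which MSXML DOM nodes provide.
function innerXml(node) {
    var parts = [];
    var kids = node.childNodes;
    for (var i = 0; i < kids.length; i++) {
        parts.push(kids.item(i).xml);
    }
    return parts.join("");
}

// WSH-only part: runs under cscript, skipped in other JavaScript hosts.
if (typeof ActiveXObject !== "undefined") {
    var oXml = new ActiveXObject("Msxml2.DOMDocument.6.0");
    oXml.async = false;
    if (oXml.load("..\\data\\24272956.xml")) {
        var ndSrc = oXml.selectSingleNode("//*[@id = 'intro']");
        WScript.Echo(ndSrc !== null ? innerXml(ndSrc) : "no node found");
    } else {
        WScript.Echo(oXml.parseError.reason);
    }
}
```

For the intro div this should echo the b and font elements with their text. If the enclosing div tag is wanted as well, ndSrc.xml alone is enough. Exact whitespace in the result depends on the preserveWhiteSpace setting.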
    

    sample outputs (bad):

    cscript 24272956.js
    ..\data\24272956.nosuchfile: The system cannot locate the object specified.
    
    cscript 24272956.js
    no node found for //*[@id = 'Fullname']
    
    cscript 24272956.js
    ..\data\24272956.xml: End tag 'Body' does not match the start tag 'body'.
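
The last message says what is wrong but not where. As a sketch, parseError also exposes line and linepos, which can be appended when they are set. The helper name describeParseError is made up for this example, and the trailing-whitespace trim assumes that reason usually ends with a line break.

```javascript
// Hypothetical helper: build a one-line message from an MSXML parseError.
// Uses only the documented reason, line and linepos properties.
function describeParseError(sFSpec, pe) {
    // Trim trailing whitespace (assumed to include a line break) from reason.
    var sMsg = sFSpec + ": " + String(pe.reason).replace(/\s+$/, "");
    // Only mention a position when the parser reported one.
    if (pe.line > 0) {
        sMsg += " (line " + pe.line + ", column " + pe.linepos + ")";
    }
    return sMsg;
}

// WSH-only part: runs under cscript, skipped in other JavaScript hosts.
if (typeof ActiveXObject !== "undefined") {
    var sFSpec = "..\\data\\24272956.xml";
    var oXml = new ActiveXObject("Msxml2.DOMDocument.6.0");
    oXml.async = false;
    if (!oXml.load(sFSpec)) {
        WScript.Echo(describeParseError(sFSpec, oXml.parseError));
    }
}
```

The 'Body' versus 'body' error is also a reminder that XML names are case-sensitive, so HTML that a browser accepts can still be rejected by .load.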