JavaScript DOMParser returns an error document instead of parsing the string


I've searched around the web and StackOverflow but didn't find anything quite like the problem I have.

I have the HTML string below:

var txtBoxForm = '<script src="http://ADDRESS"></script><noscript><a href="http://ADDRESS" target="_blank"><img src="http://ADDRESS" border=0 width=728 height=90></a></noscript>';

I am trying to parse it with:

var parser = new DOMParser();
var xmlDoc = parser.parseFromString(txtBoxForm, "text/xml");
alert(xmlDoc);
alert(xmlDoc.firstChild.nodeName);
alert(xmlDoc.firstChild.firstChild.nodeName);
alert(xmlDoc.firstChild.firstChild.firstChild.nodeName);
alert(xmlDoc.firstChild.firstChild.firstChild.firstChild.nodeName);

The problem is that even though the string begins with a <script> tag and contains none of the nodes reported, I get the results below from the alerts:

alert(xmlDoc);   ->   [Object document]
alert(xmlDoc.firstChild.nodeName);    ->    html
alert(xmlDoc.firstChild.firstChild.nodeName);    ->    body
alert(xmlDoc.firstChild.firstChild.firstChild.nodeName);    ->    parsererror
alert(xmlDoc.firstChild.firstChild.firstChild.firstChild.nodeName);   ->    h3

So my questions are:

  1. How come the parsed code does not begin with <script>, since the string does?
  2. Am I doing something wrong?
  3. How could I correctly parse that string code? My intention is to capture the src from the script and img tag.

Please help. Thanks.


Solution

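  • It seems you cannot pass a script tag to DOMParser in "text/xml" mode, and there were a few other problems. Because the string is not well-formed XML, the parser returns an error document (the html, body and parsererror nodes you saw) instead of your elements.

    • An XML document must have a single root element (I wrapped your code in <doc></doc>).
    • Script elements do not seem to be allowed (I renamed it to <scripto>).
    • Attribute values must be quoted (border="0", not border=0).
    • Every element must be closed, so <img> becomes the self-closing <img ... />.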
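
A failed "text/xml" parse does not throw an exception; the returned document simply contains a parsererror element, which is why the alerts in the question walk through unexpected nodes. The sketch below turns that silent failure into an exception. The helper name parseXmlStrict and its optional parser argument are assumptions made for illustration, not part of the DOMParser API.

```javascript
// Hypothetical helper: parse a string as XML and throw if parsing failed.
// `parser` is optional; in a browser it defaults to a new DOMParser.
function parseXmlStrict(source, parser) {
  var domParser = parser || new DOMParser();
  var doc = domParser.parseFromString(source, "text/xml");

  // On failure the document contains a <parsererror> element
  // instead of the elements from the input string.
  var errorNode = doc.querySelector("parsererror");
  if (errorNode) {
    throw new Error("XML parse failed: " + errorNode.textContent);
  }
  return doc;
}
```

In a browser, calling parseXmlStrict(txtBoxForm) with the original string from the question should throw, while the corrected string wrapped in <doc></doc> should return a usable document.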

    http://jsfiddle.net/mendesjuan/aVQaP/4/

    var txtBoxForm =
      '<doc>'+
        '<scripto src="http://ADDRESS"></scripto>'+
        '<noscript>' + 
          '<a href="http://ADDRESS" target="_blank">'+
            '<img src="http://ADDRESS" border="0" width="728" height="90" />'+
          '</a></noscript></doc>';
    
    var parser = new DOMParser();
    var xmlDoc = parser.parseFromString(txtBoxForm, "text/xml");
    
    // outputs http://ADDRESS
    console.log( xmlDoc.getElementsByTagName("scripto")[0].getAttribute("src") );
    // outputs http://ADDRESS
    console.log( xmlDoc.getElementsByTagName("img")[0].getAttribute("src") );
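
If rewriting the markup is not an option, an alternative worth trying in current browsers is to parse the string as "text/html" instead of "text/xml". The HTML parser tolerates unquoted attributes, unclosed <img> tags and several top-level elements, and to my knowledge scripts in a document produced by DOMParser are not executed. This is a minimal sketch rather than a tested drop-in; collectSrcs and its optional parser argument are hypothetical names introduced here.

```javascript
// Hypothetical helper: collect the src attribute of every element
// matching `selector` in an HTML string.
// `parser` is optional; in a browser it defaults to a new DOMParser.
function collectSrcs(html, selector, parser) {
  var domParser = parser || new DOMParser();

  // "text/html" selects the forgiving HTML parser rather than the XML one.
  var doc = domParser.parseFromString(html, "text/html");

  var nodes = doc.querySelectorAll(selector);
  var srcs = [];
  for (var i = 0; i < nodes.length; i++) {
    srcs.push(nodes[i].getAttribute("src"));
  }
  return srcs;
}
```

With the unmodified string from the question, collectSrcs(txtBoxForm, "script[src], img[src]") should return the src values of both tags without the <doc> wrapper or the <scripto> rename.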