
Why is this XML well formed?


This is the first time I have seen XML like this:

<?xml version="1.0" encoding="UTF-8"?>
<test>
     <enclosingTag>Floating_Strange_Text <!--It's new to me-->
         <tag>data</tag>
         <tag>data</tag>
     </enclosingTag>
</test>

However, after testing it in well-formedness checkers such as the Markup Validation Service, I suspect that it really is allowed.

Is this XML fragment really well-formed?


Solution

  • If some inner content is not wrapped in angle brackets (<>), it is called a "text node". Moreover, an element can have multiple sibling text nodes, which can make your XML look quite similar to HTML, with bold/italic/underline tags mixed into the main text content. The only difference is that, unlike in HTML as browsers tolerate it, the tags cannot overlap each other arbitrarily: XML requires tags to be closed in exactly the reverse order in which they were opened.

    For instance, this is perfectly legitimate XML:

    <root>The <truth /> is that there <is> no </is> spoon</root>
    

    If we "dissect" it into nodes, the pretty-printed structure could look like this:

    <root>
        The 
        <truth />
        is that there 
        <is>
             no 
        </is>
        spoon
    </root>
    

    As you can see, the <root> element contains three text nodes, and the <is> element contains one more.

    Admittedly, unlike the purely tag-based approach, this style of XML is not widely used, partly because it complicates schema validation. At the same time, you should treat it as possible and correct XML.
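To check the node breakdown described above yourself, here is a minimal sketch using Python's standard-library xml.dom.minidom parser. It is only an illustration: the helper names describe and is_well_formed are made up for this example, and the overlapping-tags string is a hypothetical counterexample, not something from the question.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

SAMPLE = "<root>The <truth /> is that there <is> no </is> spoon</root>"


def describe(node):
    """Return a (kind, value) pair for a direct child node (hypothetical helper)."""
    if node.nodeType == node.TEXT_NODE:
        return ("text", node.data)
    return ("element", node.tagName)


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML (hypothetical helper)."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


root = minidom.parseString(SAMPLE).documentElement

# Direct children of <root>: text nodes and elements interleaved.
children = [describe(child) for child in root.childNodes]
for kind, value in children:
    print(kind, repr(value))

# The text nodes that sit directly inside <root>.
text_nodes = [value for kind, value in children if kind == "text"]

# The single text node inside <is>.
inner_text = root.getElementsByTagName("is")[0].firstChild.data

# Mixed content is accepted; overlapping tags are rejected.
mixed_ok = is_well_formed(SAMPLE)
overlap_ok = is_well_formed("<root><b>bold <i>both</b> italic</i></root>")
print(mixed_ok, overlap_ok)
```

The loop lists the children of <root> in document order, so the three text nodes show up between the <truth> and <is> elements. The last line contrasts the mixed-content sample with overlapping tags, which the parser refuses.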