xml, parsing, syntax, context-free-grammar

How to produce a sequence of parallel XML elements (STag content ETag) using its grammar?


The following productions come from the linked grammar:

[1]  document      ::=      prolog element Misc*
[39] element       ::=      STag content ETag
[43] content       ::=      CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

Clearly, we can produce a single element (for example, <p>hello world</p>) by expanding

  • element to <p> content </p>, and then
  • content to hello world

What I am wondering, though, is how to produce a sequence of parallel (sibling) elements, like this:

<p>hello world</p>
<p>hello world</p>
<p>hello world</p>
<p>hello world</p>

It seems that the element production can only be expanded into nested elements, like this:

<p>
   <p>
       <p>hello world</p>
   </p>
</p>

As I understand it, producing a sequence of parallel elements would require a grammar like the following:

document      ::=      prolog elements Misc*
elements      ::=      STag content ETag (STag content ETag)*
content       ::=      CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

Am I missing something?


Solution

  • The linked grammar says that:

    • a document must have a single top-level element, and
    • an element (via content) can contain zero or more (child) elements.

    So,

    <p>hello world</p>
    <p>hello world</p>
    

    isn't a well-formed document, but

    <something>
      <p>hello world</p>
      <p>hello world</p>
    </something>
    

    is a well-formed document.


    Your suggested grammar would allow

    <p>hello world</p>
    <p>hello world</p>
    

    as a document (strictly speaking not quite, because it makes no provision for the line break between the two elements), but then you are no longer talking about XML documents.
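
The point above can be checked with a small sketch. The helpers element() and content() are hypothetical names that only mimic productions [39] and [43] by string concatenation; they are not part of any library. Python's standard xml.etree.ElementTree parser is then used to see which of the two strings it accepts. The repeated group in content is what places the <p> elements side by side, while the document production still requires one enclosing element.

```python
import xml.etree.ElementTree as ET


def element(name, inner):
    # Hypothetical helper mirroring [39] element ::= STag content ETag
    return f"<{name}>{inner}</{name}>"


def content(*items):
    # Hypothetical helper mirroring
    # [43] content ::= CharData? ((element | ...) CharData?)*
    # The starred group may repeat, so any number of child elements
    # (separated by optional character data) can sit next to each other.
    return "".join(items)


p = element("p", content("hello world"))

# Two sibling elements: valid as *content*, but not a document by itself.
siblings = content(p, "\n", p)

# The same siblings as the content of a single top-level element.
wrapped = element("something", content("\n", p, "\n", p, "\n"))

root = ET.fromstring(wrapped)
child_tags = [child.tag for child in root]

try:
    ET.fromstring(siblings)
    siblings_parse = True
except ET.ParseError:
    # The parser rejects a second top-level element.
    siblings_parse = False

print(child_tags, siblings_parse)
```

Under these assumptions the wrapped string parses into a root with two <p> children, and the bare sibling sequence is rejected by the parser.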