Writing a DTD: How to achieve this children setup


The element tasklist may contain at most one title and at most one description, plus any number (including zero) of task elements, all in any order.

The naive approach does not work, because it fixes the order of the children:

<!ELEMENT tasklist (title?, description?, task*) >

Alternatively, I could explicitly name all possible options:

(title, description?, task*) |
(title, task+, description?, task*) |
(task+, title, task*, description?, task*) |
(description, title?, task*) |
(description, task+, title?, task*) |
(task+, description, task*, title?, task*) |
(task*)

but then it's quite easy to write a non-deterministic rule, and furthermore it looks like the direct path to darkest madness. Any ideas on how this could be done more elegantly?

And no, XSD or RELAX NG is not an option. I need a plain old DTD.


Solution

  • This summarises what you need:

    <!ELEMENT tasklist (task*, ((title?, task*, description?) |
                        (description?, task*, title?)), task*)>
    

    The alternation covers title appearing either before or after description.
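As a quick sanity check on what this model accepts, it can be translated into an ordinary regular expression over one-letter codes and compared against the stated requirement by brute force. The sketch below is only an illustration: the helper `model_to_regex`, the codes `T`/`D`/`k`, and the length bound of 6 are assumptions made for the example. A regex match also says nothing about whether the model is deterministic.

```python
import re
from itertools import product

# Hypothetical one-letter codes for the three child element names.
CODES = {"title": "T", "description": "D", "task": "k"}

def model_to_regex(model):
    """Turn a DTD content model into a regex over the one-letter codes."""
    body = re.sub(r"[\s,]", "", model)  # ',' is plain sequencing in a DTD
    return re.compile(re.sub(r"[a-z]+", lambda m: CODES[m.group()], body))

MODEL = """(task*, ((title?, task*, description?) |
                    (description?, task*, title?)), task*)"""

def meets_requirement(children):
    """At most one title and at most one description, tasks anywhere."""
    return children.count("T") <= 1 and children.count("D") <= 1

pattern = model_to_regex(MODEL)

# Compare model and requirement on every child sequence of length 0..6.
mismatches = [
    "".join(seq)
    for n in range(7)
    for seq in product("TDk", repeat=n)
    if bool(pattern.fullmatch("".join(seq))) != meets_requirement("".join(seq))
]
print(mismatches)
```

An empty `mismatches` list means the model and the requirement agree on every child sequence up to the chosen length.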

    However, this is not a deterministic content model, as @13ren explains in his answer. [Here is another example from Microsoft](http://msdn.microsoft.com/en-us/library/9bf3997x(VS.71).aspx).

    In short

    The content model above is non-deterministic, and XML requires DTD content models to be deterministic, so it cannot be used as written.

    Alternatives

    If you can accept one restriction, namely that when both title and description are present, whichever of them comes second must be the last element, you can use this deterministic DTD declaration:

    <!ELEMENT tasklist (
      task*,
      ((title, task*, description?) | 
      (description, task*, title?))?
    )>
    

    Examples:

    <!-- Valid -->
    <tasklist>
      <task></task>
      <task></task>
      <task></task>
      <title></title>
      <task></task>
      <description></description>
    </tasklist>
    <!-- Valid -->
    <tasklist>
      <title></title>
      <task></task>
      <task></task>
      <task></task>
    </tasklist>
    <!-- Invalid
    <tasklist>
      <task></task>
      <title></title>
      <task></task>
      <description></description>
      <task></task>
    </tasklist>
    -->
    

    Or, possibly more naturally, require that a title or description element comes first, and that title and description are either both present or both absent:

    <!ELEMENT tasklist (
      ((title, task*, description) | 
      (description, task*, title))?,
      task*
    )>
    

    Examples:

    <!-- Valid -->
    <tasklist>
      <title></title>
      <task></task>
      <description></description>
      <task></task>
      <task></task>
    </tasklist>
    <!-- Invalid
    <tasklist>
      <task></task>
      <title></title>
      <description></description>
      <task></task>
      <task></task>
    </tasklist>
    
    <tasklist>
      <title></title>
      <task></task>
      <task></task>
      <task></task>
    </tasklist>
    -->
    

    Otherwise

    If neither restriction is acceptable, you need to use RELAX NG, which allows non-deterministic models.