
What does % mean in DTD?


I am reading the W3C Recommendation on XML. I came across this DTD example:

<!ELEMENT %name.para; %content.para; >

What does this mean? What kind of XML will satisfy the declaration?


Solution

  • The tokens beginning with % in the declaration you quote (%name.para; and %content.para;) are references to parameter entities. Parameter entities are described in section 4 of the XML spec; they resemble general entities (which use & not % as their opening delimiter), but general entities are used in the document body, while parameter entities are used inside the DTD.

    To be correct in context, this declaration requires that the parameter entities name.para and content.para have been declared earlier in the DTD. The kind of XML that satisfies the declaration depends on how those entities were declared.

    The relevant declarations might be as follows, for example:

    <!ENTITY % name.para "p">
    <!ENTITY % content.para "(#PCDATA | emph | name | date)*">
    

    In this case, the declaration you quote will have the same effect as the following declaration (which I get by replacing the parameter-entity references with the replacement text of the relevant parameter entities):

    <!ELEMENT p (#PCDATA | emph | name | date)* >
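
    The substitution can be imitated in a few lines of Python. This is only a rough sketch: the function name expand_parameter_entities and the dictionary of replacement texts are my own illustration, the regular expression accepts a simplified set of name characters, and a real XML processor does a good deal more. The space padding mirrors the spec's rule that replacement text included in the DTD is enlarged by one space on each side, which is why the whitespace is normalized at the end.

```python
import re

# Matches a parameter-entity reference such as %name.para;
# (simplified: real XML names allow more characters than this).
PE_REF = re.compile(r"%([A-Za-z_:][\w.\-:]*);")


def expand_parameter_entities(declaration, entities):
    """Replace each %name; with its replacement text, padded with spaces."""
    def substitute(match):
        return " " + entities[match.group(1)] + " "

    expanded = PE_REF.sub(substitute, declaration)
    # Collapse runs of whitespace so the result is easy to compare.
    return re.sub(r"\s+", " ", expanded).strip()


# Replacement texts taken from the two example entity declarations.
entities = {
    "name.para": "p",
    "content.para": "(#PCDATA | emph | name | date)*",
}

print(expand_parameter_entities("<!ELEMENT %name.para; %content.para; >", entities))
```

    Run as is, this prints the same declaration for p as the one shown just above.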
    

    The indirection offered by the parameter entities allows easy customization of the vocabulary. The caller of the DTD can override the declarations of the two parameter entities with alternative declarations, for example to change the name of the element from p to para and to add a partnum element to the content model:

    <!ENTITY % content.para "(#PCDATA | emph | name | date
       | partnum)*">
    <!ENTITY % name.para "para">
    

    Now the effect of the element declaration is to declare an element named para:

    <!ELEMENT para (#PCDATA | emph | name | date | partnum)* >
    

    (I've normalized whitespace here for clarity.)
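
    The override works because, when an entity is declared more than once, the first declaration encountered is the binding one, and declarations in a document's internal subset are read before those in the external DTD. A small Python sketch of that rule follows. The function name binding_declarations and the split into "caller" and "defaults" fragments are assumptions made for illustration, and the pattern only handles double-quoted parameter-entity values.

```python
import re

# Simplified pattern for <!ENTITY % name "value"> (double-quoted values only).
PE_DECL = re.compile(r'<!ENTITY\s+%\s+([^\s"]+)\s+"([^"]*)"\s*>')


def binding_declarations(*dtd_fragments):
    """Collect parameter entities in reading order; the first declaration wins."""
    entities = {}
    for fragment in dtd_fragments:
        for name, value in PE_DECL.findall(fragment):
            if name not in entities:
                # Whitespace is normalized only to make the result readable.
                entities[name] = " ".join(value.split())
    return entities


# Hypothetical caller-supplied declarations, read first ...
caller = '''
<!ENTITY % content.para "(#PCDATA | emph | name | date
   | partnum)*">
<!ENTITY % name.para "para">
'''

# ... followed by the DTD's own defaults, which are then ignored.
defaults = '''
<!ENTITY % name.para "p">
<!ENTITY % content.para "(#PCDATA | emph | name | date)*">
'''

print(binding_declarations(caller, defaults))
print(binding_declarations(defaults))
```

    The first call yields the para variant and the second, with no overrides in front, yields the defaults.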

    The following XML is valid against the first set of declarations, but not the second:

    <p>Hello, world!</p>
    

    The following is valid against the second but not the first:

    <para>You will need the replacement vacillator (part number
      <partnum>Q34-5332</partnum>) and a screwdriver.</para>
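
    To see why each sample fits only one set of declarations, here is a deliberately narrow check written with Python's standard library. It is not a DTD validator: the helper satisfies_mixed_declaration is hypothetical, it looks only at the root element and its direct children, and it assumes a mixed-content model of the form (#PCDATA | a | b)*. Real validation would also need a DOCTYPE and declarations for the child elements such as partnum.

```python
import xml.etree.ElementTree as ET


def satisfies_mixed_declaration(xml_text, element_name, allowed_children):
    """Check the root element against <!ELEMENT name (#PCDATA | a | b ...)*>."""
    root = ET.fromstring(xml_text)
    if root.tag != element_name:
        return False
    # Mixed content permits any text plus any number of the listed
    # child elements, in any order.
    return all(child.tag in allowed_children for child in root)


# Element name and permitted children under each set of declarations.
first = ("p", {"emph", "name", "date"})
second = ("para", {"emph", "name", "date", "partnum"})

doc_p = "<p>Hello, world!</p>"
doc_para = """<para>You will need the replacement vacillator (part number
  <partnum>Q34-5332</partnum>) and a screwdriver.</para>"""

for doc in (doc_p, doc_para):
    print(satisfies_mixed_declaration(doc, *first),
          satisfies_mixed_declaration(doc, *second))
```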
    

    One common use of parameter entities is to provide hooks for customization of DTDs, as shown here. Some DTDs allow elements to be renamed; many provide a system of element classes, which allows extension elements to be added by overriding a single parameter-entity declaration instead of changing every relevant content model.
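
    The element-class idea can be sketched the same way. In the Python below, the entity names inline.class and local.inline.class and the element item are invented for illustration and are not taken from any particular DTD. The class entity is referenced from two content models, so overriding the one empty "local" entity extends both. The expansion is simplified: it pads every replacement with spaces and then normalizes whitespace.

```python
import re

# Either a parameter-entity declaration (groups 1 and 2) or an element declaration.
DECL = re.compile(r'<!ENTITY\s+%\s+(\S+)\s+"([^"]*)"\s*>|<!ELEMENT\s[^>]*>')
PE_REF = re.compile(r"%([^\s;%]+);")


def expand(text, entities):
    """Replace %name; references, padding each replacement with spaces."""
    return PE_REF.sub(lambda ref: " " + entities[ref.group(1)] + " ", text)


def effective_element_declarations(dtd_text):
    """Return the element declarations with all parameter entities expanded."""
    entities, elements = {}, []
    for match in DECL.finditer(dtd_text):
        name, value = match.group(1), match.group(2)
        if name is None:
            # Element declaration: expand references, then normalize whitespace.
            elements.append(" ".join(expand(match.group(0), entities).split()))
        elif name not in entities:
            # Entity declaration: the first one wins, and references in its
            # value are expanded when the declaration is read.
            entities[name] = expand(value, entities)
    return elements


# Hypothetical DTD with an extension hook that is empty by default.
DTD = '''
<!ENTITY % local.inline.class "">
<!ENTITY % inline.class "emph | name | date %local.inline.class;">
<!ELEMENT p    (#PCDATA | %inline.class;)*>
<!ELEMENT item (#PCDATA | %inline.class;)*>
'''

# Hypothetical override, placed first so that it is the binding declaration.
OVERRIDE = '<!ENTITY % local.inline.class "| partnum">'

for declaration in effective_element_declarations(OVERRIDE + DTD):
    print(declaration)
```

    With the override in front, both p and item accept partnum; without it, neither does, and no content model had to be edited by hand.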