
XML DTD Entities - nested declarations


Is it possible to use nested XML DTD entity declarations? I've done some research, but couldn't find a satisfactory solution for my requirements.

What I'm currently using:

<!DOCTYPE test [
<!ENTITY system "SystemA">
<!ENTITY path SYSTEM "file:./path/to/SystemA">
<!ENTITY config_file SYSTEM "file:./config/for/SystemA.config">
]>

Since "SystemA" is already declared as the entity system, I'd prefer to reference that entity instead of repeating "SystemA" in all subsequent declarations.


Solution

  • If you just want a general entity that expands to a file path as a text string in your body content, you can do that right away by referencing one entity from within another's replacement text:

    <!ENTITY system "SystemA">
    <!ENTITY path "&system;.config">
    

    However, if you want to fetch markup declarations from a .config file so that they become part of the DTD, you need an external entity declaration (as you've already figured out). For text substitution in declarations (as opposed to content), you must use parameter entities rather than general entities. Parameter entities are declared using a percent sign; note the space following the percent sign in the declaration, whereas there must be none in parameter entity references:

    <!ENTITY % paramentity "SystemA">
    

    Now you could use %paramentity; to substitute the string SystemA into markup declarations:

    <!ELEMENT x ...>
    %paramentity;
    

    though the above will give errors, since the string SystemA (or any other character data) isn't allowed to appear between declarations in the DTD, that is, in the document prolog.

    There's an additional catch, though: parameter entity references aren't expanded in system identifier literals, so this does not work:

    <!ENTITY % myent  SYSTEM "%paramentity;/f.config">
    

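    Instead, the string %paramentity;/f.config is taken verbatim as the system identifier of %myent;. To build up system identifiers from parameter entities, the reference must stand in for the literal token as a whole, quotes included. In XML this is only allowed in the external DTD subset, since within the internal subset parameter entity references may not occur inside markup declarations: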

    <!ENTITY % sysid "'filename'">
    <!ENTITY file SYSTEM %sysid;>
    

    This behaviour of XML is due to SGML (from which XML is derived as a subset). Specifically, SGML puts a number of restrictions on where parameter entity expansion can occur in the document prolog and, in SGML only, in markup declarations within content such as marked section declarations (XML doesn't have these). SGML does this to prevent obfuscatory use of markup.
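
Two of the claims above can be checked with a few lines of Python: that a general entity may reference another one in its replacement text, and that a parameter entity reference inside a system literal is not expanded. This is only a minimal sketch using the standard library's expat-based, non-validating parsers; the helper names expanded_text and declared_system_ids are made up for illustration, and other parsers may report things differently.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

# Nested general entities: &path; refers to &system; inside its entity value.
NESTED_DOC = """<!DOCTYPE test [
<!ENTITY system "SystemA">
<!ENTITY path "&system;.config">
]>
<test>&path;</test>"""

# A parameter entity reference placed inside a system literal.
SYSID_DOC = """<!DOCTYPE test [
<!ENTITY % paramentity "SystemA">
<!ENTITY % myent SYSTEM "%paramentity;/f.config">
]>
<test/>"""


def expanded_text(doc: str) -> str:
    """Return the root element's text after internal entities are expanded."""
    return ET.fromstring(doc).text


def declared_system_ids(doc: str) -> dict:
    """Map each external entity name to the system identifier the parser reports."""
    system_ids = {}

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Internal entities carry a value and no system identifier; skip them.
        if system_id is not None:
            system_ids[name] = system_id

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(doc, True)
    return system_ids


if __name__ == "__main__":
    print(expanded_text(NESTED_DOC))
    print(declared_system_ids(SYSID_DOC))
```

The second function only inspects the declaration and never fetches the external entity, so no f.config file needs to exist for the check.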