
How can both XML declaration and DTD be optional in the spec if they are both prerequisites to validity and well-formedness of an XML document?


I'm chewing my way through the latest XML 1.0 specification, in which an XML document is defined as follows:

[1]     document       ::=      prolog element Misc*
...
[22]    prolog         ::=      XMLDecl? Misc* (doctypedecl Misc*)?
[23]    XMLDecl        ::=      '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
...

[28]    doctypedecl    ::=      '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'

The spec states that

  • [Definition: An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it.]

  • and well-formed if "It meets all the well-formedness constraints given in this specification." (see definition).

The definition of the document type declaration has two well-formedness constraints and one validity constraint, so if it's omitted the XML document cannot be considered valid.

There is a minimal XML document example in there,

<?xml version="1.0"?>
<greeting>Hello, world!</greeting>

and I understand why it is well-formed but not valid, but it still doesn't explain how the DTD can be optional if it is required for an XML document to be valid.


Background for this question

I started reading the XML spec because I wanted to get a better understanding before getting into DocBook 5, but its manual states that "DocBook V5.0 is thus defined using a powerful schema language called RELAX NG" so it "does not depend on DTDs anymore", and the example shown completely omits the DTD too.


Solution

  • The W3C XML Recommendation defines only one XML schema language: DTD. Others exist, including XSD, RELAX NG, and Schematron. In fact, DTDs are rarely used to define modern XML vocabularies because of their limited expressiveness.

    The concept of validity has been extended to apply to all XML schemas: An XML document is said to be valid against an XML schema if it adheres to the grammar and content constraints defined by the schema.

    • A DTD can be omitted for the same reason that an XML document need not be associated with any XML schema: Adherence to the rules of well-formedness is often sufficient for applications.
    • An XML declaration can be omitted because its values' defaults are sufficient to support the well-formedness rules throughout the rest of the Recommendation.
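
    The two points above can be checked with a quick sketch. This assumes Python's standard `xml.etree.ElementTree`, whose underlying parser is non-validating: it checks well-formedness only and never validates against a DTD. The sample documents and the `is_well_formed` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, used only for illustration.
SAMPLES = {
    "declaration, no DTD": (
        '<?xml version="1.0"?>\n'
        "<greeting>Hello, world!</greeting>"
    ),
    "no declaration, no DTD": "<greeting>Hello, world!</greeting>",
    "declaration and internal DTD": (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE greeting [<!ELEMENT greeting (#PCDATA)>]>\n"
        "<greeting>Hello, world!</greeting>"
    ),
    "mismatched tags": "<greeting>Hello, world!</greting>",
}


def is_well_formed(text: str) -> bool:
    """Return True if a non-validating parse of `text` succeeds."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for well-formedness errors such as mismatched tags.
        return False
    return True


# Map each sample label to the outcome of the well-formedness check.
results = {label: is_well_formed(doc) for label, doc in SAMPLES.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```

    The first three samples parse whether or not the declaration or the DTD is present; only the sample with mismatched tags is rejected. Checking validity would be a separate step with a validating parser and a DTD (or another schema) to validate against.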
