What are external entities and notations in DTD?


I have been reading about these topics in the W3C Recommendation and on Wikipedia, but I am not sure I have fully understood them. Could someone clearly explain what external entities and notations are in a DTD? What exactly are they used for?

Here are some examples of external entity declarations:

<!ENTITY open-hatch SYSTEM    
         "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY open-hatch PUBLIC 
         "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
         "http://www.textuality.com/boilerplate/OpenHatch.xml">
<!ENTITY hatch-pic SYSTEM 
         "../grafix/OpenHatch.gif"
         NDATA gif >

Correct me if I am wrong. A general, internal entity replaces the entity reference (&ent;) in the document body with the declared string. Does an external entity replace the entity reference with the entire content of an external document?


Solution

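  • Yes, you understand correctly. An entity reference (like &open-hatch;) is replaced by whatever the ENTITY declaration defines: the declared string for an internal entity, or the content of the external resource for a parsed external entity such as the first two declarations in the question.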
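
    To make the parsed case concrete, here is a minimal sketch of a document that pulls in the boilerplate through a reference. The manual and warning element names are made up for illustration, and the content model assumes that OpenHatch.xml contains only text. Note that a non-validating parser is allowed to skip the external file instead of including it.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE manual [
    <!-- manual and warning are hypothetical element names -->
    <!ELEMENT manual (warning)>
    <!-- assumes OpenHatch.xml holds plain text only -->
    <!ELEMENT warning (#PCDATA)>
    <!ENTITY open-hatch SYSTEM
             "http://www.textuality.com/boilerplate/OpenHatch.xml">
    ]>
    <manual>
        <!-- a validating parser replaces the reference with the file's content -->
        <warning>&open-hatch;</warning>
    </manual>
    ```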

    Notations identify the format of non-XML (unparsed) data. In the third declaration in the question, NDATA gif says that the content of OpenHatch.gif is in the notation gif. There also needs to be a corresponding NOTATION declaration for gif. The XML processor passes the notation on to the application, which can use it to find another application that can process data in that format.
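
    As a rough sketch, the missing declaration for the hatch-pic entity could look like the fragment below. The SYSTEM identifier "image/gif" is an assumption chosen for illustration; the XML specification does not prescribe what a notation's identifier must be, only that the application can make sense of it.

    ```xml
    <!-- Hypothetical NOTATION for gif; the identifier is an assumption,
         any string the consuming application understands would do. -->
    <!NOTATION gif SYSTEM "image/gif">
    <!ENTITY hatch-pic SYSTEM
             "../grafix/OpenHatch.gif"
             NDATA gif>
    ```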

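    Also, entities are not always referenced with the usual &entity-name; syntax. An attribute can be declared with the type ENTITY, in which case its value is the name of an entity. This is in fact the only way to refer to an unparsed (NDATA) entity such as a graphic (or the type ENTITIES for a list of names); writing &hatch-pic; in the document content would be an error.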

    For example...

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE doc [
    <!ELEMENT doc (graphic)>
    <!ELEMENT graphic EMPTY>
    <!ATTLIST graphic
              src ENTITY #REQUIRED>
    <!NOTATION cgm PUBLIC "-//USA-DOD//NOTATION Computer Graphics Metafile//EN">
    <!ENTITY test-image SYSTEM "cgm/test-image.cgm" NDATA cgm>
    ]>
    <doc>
        <graphic src="test-image"/>
    </doc>
    

    In the above example, I have an ENTITY named test-image. That entity refers to the file cgm/test-image.cgm, whose format is identified by the notation cgm. The src attribute of the graphic element references the entity by name. How all of this information is used depends on the application consuming the data.