It is actually possible to specify that an element can contain both PCDATA and other elements. Such a content model is called mixed. To specify a mixed-content model, just list #PCDATA along with the child elements you want to allow:
<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>
<!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
<!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT ORDERS (ITEM)*>
<!ELEMENT ITEM (PRODUCT, NUMBER, PRICE)>
<!ELEMENT PRODUCT (#PCDATA | PRODUCT_ID)*>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT PRODUCT_ID (#PCDATA)>
]>
<DOCUMENT>
<CUSTOMER>
<NAME>
<LAST_NAME>Weber</LAST_NAME>
<FIRST_NAME>Bill</FIRST_NAME>
</NAME>
<DATE>October 25, 2003</DATE>
<ORDERS>
<ITEM>
<PRODUCT>Asparagus</PRODUCT>
<NUMBER>12</NUMBER>
<PRICE>$2.95</PRICE>
</ITEM>
<ITEM>
<PRODUCT>Lettuce</PRODUCT>
<NUMBER>6</NUMBER>
<PRICE>$11.50</PRICE>
</ITEM>
</ORDERS>
</CUSTOMER>
</DOCUMENT>
When I checked the file with several validators (the .NET XML parser, MSXML SAX, MSXML DOM, and Java's built-in parser), I noticed that validation passes if #PCDATA comes first in the list, but fails if any element name comes before #PCDATA.
Why must #PCDATA necessarily come first in a mixed-content model?
Yes, what you are specifying here is what is called mixed content, as defined in the W3C XML specification, §3.2.2, Mixed-content Declaration:
[51] Mixed ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*'
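And indeed, the production imposes these constraints: #PCDATA must appear first, the alternatives must be separated by |, and the group must end with )* whenever element names are listed.
So basically, the reason why #PCDATA must occur first is that the specification requires it.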
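To illustrate the rule, here is a minimal sketch (not taken from the original file): a cut-down DTD in which PRODUCT uses mixed content, with the rejected ordering shown only inside a comment. The product ID value A-100 and the trailing text are made up for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE ITEM [
<!ELEMENT ITEM (PRODUCT)*>
<!-- Accepted: #PCDATA is the first alternative and the group ends with ")*" -->
<!ELEMENT PRODUCT (#PCDATA | PRODUCT_ID)*>
<!ELEMENT PRODUCT_ID (#PCDATA)>
<!-- Rejected, because an element name precedes #PCDATA:
     <!ELEMENT PRODUCT (PRODUCT_ID | #PCDATA)*>
-->
]>
<ITEM>
<!-- Text and PRODUCT_ID children may be interleaved in any order and number -->
<PRODUCT>Asparagus <PRODUCT_ID>A-100</PRODUCT_ID> (fresh)</PRODUCT>
</ITEM>
```

A validating parser should accept this document as written. Swapping in the commented-out declaration should make the DTD fail to parse, since that declaration matches neither the Mixed production nor the production for element content (children), where #PCDATA is not allowed at all.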