I'm preparing a presentation on XML and XSLT for my university's computing club. I'm no expert, but I know more than anyone else there, and it's just a one-hour primer. For my slides I figured I would use an XML document, which I would then turn into a series of web pages with XSLT 2.0.
Here is my source document:
<slideshow>
<slide title="Example">
<para>Below is an example of an XML document</para>
<code> <![CDATA[
<?xml version="1.0"?>
<elephant Name="Fido">
<head>
<eyes qty="2" colour="blue"/>
<trunk/>
<ears qty="2"/>
</head>
<body>
Thin, ribs showing
</body>
<legs qty="4">
Roughly 1.5m Long
</legs>
</elephant> ]]>
</code>
</slide>
</slideshow>
Since I don't want my examples to form part of the XML structure of the document (and don't want them to be changed by the XSLT), I have the CDATA section.
So every time I have a <code>...</code> element, it is always written:
<code><![CDATA[...]]></code>
This is duplication of information.
Is it possible for me to declare that every code element contains only (unparsed) character data? Then I would just write <code>...</code>, and the parser would never try to parse what was inside. It seems like something that could be done with a DTD, perhaps?
Possible answers might look like this:
Use
<!DECLARE-CDATA-ELEMENT code>
which would make code elements not have their content parsed.
You could simulate it with entities:
<!ENTITY CodeStart "<code><![CDATA[">
<!ENTITY CodeEnd "]]></code>">
then use: &CodeStart;<don'tParse/>&CodeEnd;
No, it can't be done, but you can enforce that all code sections contain no child elements by ...
No, but you could do preprocessing like this: ...
These aren't answers; they just indicate what answers could look like (hopefully this is now clearer).
Since I don't want my examples to form part of the XML structure of the document, (and don't want them to be changed by the XSLT) I have the CDATA section.
It isn't necessary to use a CDATA section to protect an XML fragment from being "changed by the XSLT": simply write your XSLT code so that it copies any subtree rooted at a code element.
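A minimal sketch of that idea, assuming the rest of the stylesheet is an identity transform (a real slide stylesheet would generate HTML instead, so treat the first template as a placeholder). The second template deep-copies each code element with xsl:copy-of, so no other template is ever applied to its descendants. Note that this only works if the sample inside code is itself well-formed markup; in particular, the <?xml version="1.0"?> line from the question could not appear there literally, because an XML declaration is only allowed at the very start of a document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder for the rest of the stylesheet (an assumption):
       the identity rule copies every node and recurses into it. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any code element is copied as a whole subtree.
       xsl:copy-of does not apply templates to the descendants,
       so nothing below code is rewritten by other rules. -->
  <xsl:template match="code">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The more specific match="code" pattern takes priority over the generic node() rule, so no explicit priority attribute is needed here.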
Is it possible for me to declare that every code element contains only (unparsed) character data?
Yes, but XSLT doesn't require a DTD for either the source XML document or the result of the processing. (Schema-aware XSLT 2.0 can validate these, and even intermediate results, but it works only with XML Schema (XSD).) Where such a DTD does exist, XSLT doesn't use any type information from it, the only exception being whether an attribute is declared as an ID. Therefore, providing such a DTD isn't going to be helpful.
Also, such a DTD will be violated unless you escape at least every & and < character in the child text node of code.
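To illustrate, here is a small sketch of a document with such a DTD, given as an internal subset. The content model for slide is an assumption made up for this example (it omits the title attribute and para element from the question). Both code elements below are valid against it and have exactly the same string value; the second one simply uses escaping instead of a CDATA section.

```xml
<?xml version="1.0"?>
<!DOCTYPE slide [
  <!-- Hypothetical content model, for illustration only -->
  <!ELEMENT slide (code+)>
  <!-- code may contain character data but no child elements -->
  <!ELEMENT code (#PCDATA)>
]>
<slide>
  <!-- CDATA form: & and < may appear literally -->
  <code><![CDATA[<trunk/> & <ears qty="2"/>]]></code>
  <!-- Escaped form: same text, no CDATA section needed -->
  <code>&lt;trunk/&gt; &amp; &lt;ears qty="2"/&gt;</code>
</slide>
```

Writing <trunk/> literally inside code, without either form of protection, would make it a child element, which the (#PCDATA) declaration does not allow.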
From the W3C XML specification:
"[Definition: All text that is not markup constitutes the character data of the document.] The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively."