My Python libxml2 processes files with DTD default attributes differently depending on something, and I want to know what. Here is an example using the DITA DTD (the package can be downloaded from www.dita-ot.org):
import libxml2
import libxsltmod
s = """<!DOCTYPE map PUBLIC "-//OASIS//DTD XDITA Map//EN"
"file://.../dita-ot-2.2.1/plugins/org.oasis-open.dita.v1_2/dtd/technicalContent/dtd/map.dtd">
<map title="Empty map">
</map>"""
libxml2.substituteEntitiesDefault(1)
xmldoc = libxml2.parseDoc(s)
print xmldoc
The output is as desired:
<?xml version="1.0"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD XDITA Map//EN"
"file://.../dita-ot-2.2.1/plugins/org.oasis-open.dita.v1_2/dtd/technicalContent/dtd/map.dtd">
<map xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/"
title="Empty map" ditaarch:DITAArchVersion="1.2" domains="(topic delay-d)
(map mapgroup-d) (topic indexing-d)
(map glossref-d) (topic hi-d)
(topic ut-d) (topic hazard-d)
(topic abbrev-d) (topic pr-d)
(topic sw-d) (topic ui-d)
" class="- map/map ">
</map>
But if I comment out import libxsltmod, the result is:
<?xml version="1.0"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD XDITA Map//EN"
"file://.../dita-ot-2.2.1/plugins/org.oasis-open.dita.v1_2/dtd/technicalContent/dtd/map.dtd">
<map title="Empty map">
</map>
So importing libxsltmod does something that activates default attribute expansion. What does it do, and how can I activate this behaviour from Python myself?
I have no idea how libxsltmod enables this setting globally, but normally, DTD default attributes are added with the parser option XML_PARSE_DTDATTR. Use readDoc instead of parseDoc to provide parser options:
xmldoc = libxml2.readDoc(s, None, None, libxml2.XML_PARSE_DTDATTR)
Or, if you also want to substitute entities:
flags = libxml2.XML_PARSE_NOENT | libxml2.XML_PARSE_DTDATTR
xmldoc = libxml2.readDoc(s, None, None, flags)
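If you want to check the effect of the option without the DITA package, a rough sketch along these lines should work. It parses the same document with and without XML_PARSE_DTDATTR. The tiny DTD, the file name tiny.dtd and the helper names serialize_root and compare_dtd_defaults are made up for illustration and are not part of libxml2 or DITA. The import guard is only there so the snippet loads on machines without the bindings.

```python
import os
import shutil
import tempfile

try:
    import libxml2  # libxml2's own Python bindings, not lxml
except ImportError:
    libxml2 = None  # keeps the snippet importable where the bindings are missing

# Hypothetical one-element DTD declaring a default attribute (not the DITA DTD).
DTD_TEXT = '<!ELEMENT map EMPTY>\n<!ATTLIST map class CDATA "- map/map ">\n'


def serialize_root(xml_text, options):
    """Parse xml_text with the given parser options and serialize its root element."""
    doc = libxml2.readDoc(xml_text, None, None, options)
    try:
        return doc.getRootElement().serialize()
    finally:
        doc.freeDoc()  # documents from these bindings must be freed explicitly


def compare_dtd_defaults():
    """Return the root element as parsed without and with XML_PARSE_DTDATTR."""
    tmpdir = tempfile.mkdtemp()
    try:
        # The DTD lives in an external file, as in the DITA case.
        dtd_path = os.path.join(tmpdir, "tiny.dtd")
        with open(dtd_path, "w") as f:
            f.write(DTD_TEXT)
        xml_text = '<!DOCTYPE map SYSTEM "%s">\n<map/>' % dtd_path
        plain = serialize_root(xml_text, 0)
        expanded = serialize_root(xml_text, libxml2.XML_PARSE_DTDATTR)
        return plain, expanded
    finally:
        shutil.rmtree(tmpdir)
```

Calling compare_dtd_defaults() returns two strings. I would expect the first to lack the class attribute and the second to carry the default declared in the DTD, since the external subset is only loaded when the option is set.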