Tags: python, xml, python-sphinx, docutils

Can I load docutils nodes directly as an XML file?


A docutils document is a hierarchy of nodes. The documentation says throughout that the nodes are XML (or at least XML-like), and there are ways to dump documents and document fragments in XML format. There is even a DTD describing an XML document format for nodes.

I want to generate docutils nodes using a non-Python tool, serialize them into a machine-readable format, then load them with docutils (as part of a Sphinx plugin).

I tried to find a way to load either a full XML document or an XML fragment (in the DTD-specified format) and get back a node tree, but I can't find anything. Is there any way to read the docutils XML format back into nodes, that is, the reverse of Node.asdom()?

More generally, is there any machine-readable format that I can load with docutils that describes a node tree? Either something based on XML or some other format?


Solution

  • The development version of Docutils in the repository (0.22b.dev) introduces a "Docutils XML" parser that parses an XML serialisation back into a Document Tree.

    So you may try it with, e.g.,

    rst2xml my_document.rst > my_document.xml
    
    # edit my_document.xml
    
    docutils --parser=xml --writer=html5 my_document.xml > my_document.html
    

    But it should also work with XML input generated by other means (or hand-crafted), as long as it is valid.

    You may also include a complete or partial document in an rST source using the "parser" option of the "include" directive:

    .. include:: my_document.xml
       :parser: xml
    

    Caveat: Because the parser is a Docutils module, it may fail on the Document Tree elements that Sphinx adds as extensions.
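
To illustrate "XML input generated by other means", here is a minimal sketch that builds a small Docutils XML document using only the Python standard library. It stands in for whatever non-Python tool produces the XML. The function name build_docutils_xml, the output file name generated.xml, and the simple slug used for the "ids" attribute are assumptions made for this example. The element names (document, section, title, paragraph) come from the Docutils Generic DTD. Whether the result is accepted by the new parser should be checked against your Docutils version.

```python
import xml.etree.ElementTree as ET


def build_docutils_xml(title, paragraphs, source="generated"):
    """Return a minimal Docutils XML document as a string.

    Hypothetical helper for illustration only. It emits a single
    <section> holding a <title> and one <paragraph> per input string.
    """
    document = ET.Element("document", {"source": source})
    # Assumption: a lowercase, hyphen-joined slug is good enough as an ID here.
    section_id = "-".join(title.lower().split())
    section = ET.SubElement(document, "section", {"ids": section_id})
    ET.SubElement(section, "title").text = title
    for text in paragraphs:
        ET.SubElement(section, "paragraph").text = text
    # ElementTree escapes characters such as "&" and "<" in text content.
    return ET.tostring(document, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_docutils_xml(
        "Generated section",
        ["First paragraph.", "Second paragraph with an & ampersand."],
    )
    with open("generated.xml", "w", encoding="utf-8") as handle:
        handle.write(xml_text)
```

The resulting generated.xml could then be passed to the command shown above in place of my_document.xml, or pulled into an rST source with the "include" directive and its "parser" option.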