Tags: python, xml, xml.etree, yattag

Replicating xml.etree example in yattag


I am trying to choose between using xml.etree and yattag. yattag seems to have a more concise syntax but I couldn't 100% replicate this xml.etree example:

from xml.etree.ElementTree import Element, SubElement, Comment, tostring

top = Element('top')

comment = Comment('Generated for PyMOTW')
top.append(comment)

child = SubElement(top, 'child')
child.text = 'This child contains text.'

child_with_tail = SubElement(top, 'child_with_tail')
child_with_tail.text = 'This child has regular text.'
child_with_tail.tail = 'And "tail" text.'

child_with_entity_ref = SubElement(top, 'child_with_entity_ref')
child_with_entity_ref.text = 'This & that'

print(tostring(top))

from xml.etree import ElementTree
from xml.dom import minidom

def prettify(elem):
    """Return a pretty-printed XML string for the Element.
    """
    rough_string = ElementTree.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")

print(prettify(top))

which returns

<?xml version="1.0" ?>
<top>
  <!--Generated for PyMOTW-->
  <child>This child contains text.</child>
  <child_with_tail>This child has regular text.</child_with_tail>
  And &quot;tail&quot; text.
  <child_with_entity_ref>This &amp; that</child_with_entity_ref>
</top>

My attempt using yattag:

from yattag import Doc
from yattag import indent

doc, tag, text, line = Doc().ttl()

doc.asis('<?xml version="1.0" ?>')
with tag('top'):
    doc.asis('<!--Generated for PyMOTW-->')
    line('child', 'This child contains text.')
    line('child_with_tail', 'This child has regular text.')
    doc.asis('And "tail" text.')
    line('child_with_entity_ref','This & that')

result = indent(
    doc.getvalue(),
    indentation = '    ',
    newline = '\r\n',
    indent_text = True
)

print(result)

which returns:

<?xml version="1.0" ?>
<top>
    <!--Generated for PyMOTW-->
    <child>
        This child contains text.
    </child>
    <child_with_tail>
        This child has regular text.
    </child_with_tail>
    And "tail" text.
    <child_with_entity_ref>
        This &amp; that
    </child_with_entity_ref>
</top>

So the yattag code is shorter and simpler (I think), but I couldn't work out how to:

  1. Automatically add the XML version tag at the start (workaround is doc.asis)
  2. Create a comment (workaround is doc.asis)
  3. Escape the " character. xml.etree replaced it with &quot;
  4. Add tail text --- but I'm not sure why I'd need this.

My question: can I handle the four points above any better in yattag than I have done here?

Note: I'm building XML to interact with this API.


Solution

  • For 1 and 2, doc.asis is the best way to proceed.

    For 3, use text('And "tail" text.') instead of doc.asis. The text method escapes the characters that must be escaped inside a text node: &, <, and >. Note that it does not escape the " character, and this is expected: " only needs to be escaped when it appears inside an XML or HTML attribute value, not inside a text node. (source: http://www.yattag.org/#the-text-method )

    I didn't understand 4.
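
To illustrate the escaping rule from point 3, and what "tail" text in point 4 refers to, here is a minimal sketch that uses only the standard library (xml.sax.saxutils and xml.etree.ElementTree), not yattag itself. The sample strings and variable names are made up for illustration. It assumes that xml.sax.saxutils.escape follows the same text-node rule as described above for yattag's text method.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample string containing every character discussed above.
raw = 'This & that, "quoted" <tag>'

# Text-node escaping: only &, < and > are replaced; " is left alone.
escaped_text = escape(raw)
print(escaped_text)

# Attribute-style escaping: pass an extra mapping so " becomes &quot; too.
escaped_attr = escape(raw, {'"': '&quot;'})
print(escaped_attr)

# A literal " and &quot; in a text node parse back to the same string,
# so either spelling is valid XML for this document.
plain = ET.fromstring('<top><child>text</child>And "tail" text.</top>')
entity = ET.fromstring(
    '<top><child>text</child>And &quot;tail&quot; text.</top>'
)

# "Tail" is ElementTree's name for the text that follows an element's
# closing tag, here the text after </child> and before </top>.
print(plain[0].tail)
print(entity[0].tail)
```

Both parsed documents report the same tail for the child element, which suggests that text written right after a child inside the parent plays the role of xml.etree's tail, and that leaving " unescaped there does not change what a parser reads back.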