python lxml pretty-print

Why does print() show lxml's pretty-printed XML on one line in Python 3.4.3?


While working through an lxml tutorial, I noticed that print puts the whole output on one line, even with pretty_print=True.

So say I just installed Python 3.4.3 64-bit and installed lxml-3.4.0.win32-py3.4.exe after Python was installed.

Then, in IDLE or at the python.exe cmd prompt, I do the following:

from lxml import etree
root = etree.XML('<root><a><b/></a></root>')
print(etree.tostring(root, pretty_print=True))

What I (and the tutorial) expected was the following output to the screen:

<root>
  <a>
    <b/>
  </a>
</root>

But what I actually see, in both IDLE and the python cmd prompt on Windows 7, is this:

b'<root>\n  <a>\n    <b/>\n  </a>\n</root>\n'

So why does the interpreter do this? Is there a way to switch between this single-line output and the normal multi-line output? And perhaps more importantly, if I write this XML to a file, will lxml put the literal \n sequences on one line instead of pretty-printing the XML the way the tutorial shows?

Thanks, Johnny


Solution

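  • What you see is the repr of a bytes object. In Python 3, etree.tostring returns bytes by default, and print shows each newline in a bytes object as the escape sequence \n. The bytes themselves contain real newlines, so you can write them directly to a file opened in binary mode: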

    with open("file.xml", "wb") as output:
        output.write(etree.tostring(root, pretty_print=True))
    

    To print the text itself, have tostring return a str (Unicode string) instead of bytes by passing encoding='unicode':

    print(etree.tostring(root, pretty_print=True, encoding='unicode'))
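
The bytes-versus-text behaviour described above can be checked without lxml. The following is a minimal sketch, assuming the bytes value from the question is pasted in as a literal; the names xml_bytes, as_repr, as_text and lines_in_file, the temporary directory, and the UTF-8 encoding are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile

# The bytes value shown in the question, written as a literal so this
# sketch runs without lxml installed (an assumption for illustration).
xml_bytes = b'<root>\n  <a>\n    <b/>\n  </a>\n</root>\n'

# print() calls str() on its argument; for bytes that yields the repr,
# in which each newline appears as a backslash followed by n.
as_repr = str(xml_bytes)

# Decoding turns the bytes into text, so the newlines are real line breaks.
as_text = xml_bytes.decode('utf-8')

print(as_repr)          # one line, like the output in the question
print(as_text, end='')  # five lines, like the tutorial's output

# Writing the bytes in binary mode stores real newline bytes in the file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'file.xml')  # hypothetical file name
    with open(path, 'wb') as output:
        output.write(xml_bytes)
    with open(path, 'r', encoding='utf-8') as saved:
        lines_in_file = saved.read().splitlines()

print(len(lines_in_file))
```

If you already hold the bytes from etree.tostring, calling .decode() on them as above should give the same printable text as passing encoding='unicode' for this ASCII-only document.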