Search code examples
python-3.xxmlxmlserializer

A better XML Serializer for Python 3


I tried xml_marshaller as follows:

from xml_marshaller import xml_marshaller

class Person:
    firstName = "John"
    lastName = "Doe"

person1 = Person()
strXmlPerson = xml_marshaller.dumps(person1);
print(strXmlPerson)

The output from above is:

b'<marshal><object id="i2" module="__main__" class="Person"><tuple/><dictionary id="i3"><string>firstName</string><string>John</string><string>lastName</string><string>Doe</string></dictionary></object></marshal>'

which when formatted looks like this, which in my opinion is the ugliest XML possible:

b'<marshal>
    <object id="i2" module="__main__" class="Person">
        <tuple/>
        <dictionary id="i3">
            <string>firstName</string>
            <string>John</string>
            <string>lastName</string>
            <string>Doe</string>
        </dictionary>
    </object>
</marshal>'

What is the b and quotes doing there? Means "binary" maybe? Is that really part of the data, or just a side effect of printing it to the console?

Is there any Python 3 library that will create something more closer to "human" like this:

<Person> 
   <firstname>John</firstname>
   <lastname>Doe<lastname>
</Person> 

I'm looking for something close to what .NET creates (see http://mylifeismymessage.net/xml-serializerdeserializer/.

Please don't tell me try JSON or YAML, that's not the question. I might want to run the file through XSLT for example.

Update 2 days later:

I like the Peter Hoffman answer here: How can I convert XML into a Python object?

person1 = Person("John", "Doe")
#strXmlPerson = xml_marshaller.dumps(person1);
person = objectify.Element("Person")
strXmlPerson = lxml.etree.tostring(person1, pretty_print=True)
print(strXmlPerson)

gives error:

TypeError: Type 'Person' cannot be serialized.

In my scenario I might already have a class structure, and don't want to switch to they way they are doing it You can I serialize my "Person" class?


Solution

  • The reason the output is showing xml as a dictionary is most likely because the properties don't have a reference to the class. You should consider using self. and assigning values within a __init__ function.

    class Person:
        def __init__(self):
            self.firstName = "John"
            self.lastName = "Doe"
    

    There are many ways to convert an object to XML. However try using the package dicttoxml. As the name suggest you'll need to convert object to a dictionary, which can be done using vars().

    The full solution:

    from dicttoxml import dicttoxml
    
    class Person:
        def __init__(self):
            self.firstName = "John"
            self.lastName = "Doe"
    
    person = vars(Person()) # vars is pythonic way of converting to dictionary
    xml = dicttoxml(person, attr_type=False, custom_root='Person') # set root node to Person
    print(xml)
    

    Output:

    b'<?xml version="1.0" encoding="UTF-8" ?><Person><firstName>John</firstName><lastName>Doe</lastName></Person>'
    

    If you want to format the XML nicely, then you can use the built in xml.dom.minidom.parseString library.

    from dicttoxml import dicttoxml
    from xml.dom.minidom import parseString
    
    class Person:
        def __init__(self):
            self.firstName = "John"
            self.lastName = "Doe"
    
    person = vars(Person()) # vars is pythonic way of converting to dictionary
    xml = dicttoxml(person, attr_type=False, custom_root='Person') # set root node to Person
    print(xml)
    
    dom = parseString(xml)
    print(dom.toprettyxml())
    

    Output:

    <?xml version="1.0" ?>
    <Person>
       <firstName>John</firstName>
       <lastName>Doe</lastName>
    </Person>
    

    Do check out the documentation https://pypi.org/project/dicttoxml/ as you can pass additional arguments to alter the output.