PyXB XML Object to String


Given a PyXB object, how can one turn it into a string?

I used PyXB to generate an XML document, which I would then like to turn into a dictionary using the xmltodict module. The issue is that xmltodict.parse takes a bytes-like object, which the PyXB object, of course, is not.


Solution

  • I found a function in the d1_python library that accomplishes this. It takes a PyXB object and serializes it to XML with the given encoding.

      import re

      import pyxb

      # is_pyxb() is another d1_python helper (not shown here) that checks
      # whether the argument is a PyXB binding object.

      def serialize_gen(obj_pyxb, encoding, pretty=False, strip_prolog=False):
        """Serialize a PyXB object to XML.

        - If {pretty} is True, format for human readability.
        - If {strip_prolog} is True, remove any XML prolog (e.g., <?xml version="1.0"
          encoding="utf-8"?>) from the resulting string.
        """
        assert is_pyxb(obj_pyxb)
        assert encoding in (None, 'utf-8')
        try:
          if pretty:
            pretty_xml = obj_pyxb.toDOM().toprettyxml(indent='  ', encoding=encoding)
            # Remove empty lines in the result caused by a bug in toprettyxml().
            # With encoding=None the result is str; otherwise it is bytes.
            if encoding is None:
              pretty_xml = re.sub(r'^\s*$\n', r'', pretty_xml, flags=re.MULTILINE)
            else:
              pretty_xml = re.sub(rb'^\s*$\n', b'', pretty_xml, flags=re.MULTILINE)
          else:
            pretty_xml = obj_pyxb.toxml(encoding)
          if strip_prolog:
            if encoding is None:
              pretty_xml = re.sub(r'^<\?(.*)\?>', r'', pretty_xml)
            else:
              pretty_xml = re.sub(rb'^<\?(.*)\?>', b'', pretty_xml)
          return pretty_xml.strip()
        except pyxb.ValidationError as e:
          raise ValueError(
            'Unable to serialize PyXB to XML. error="{}"'.format(e.details())
          )
        except pyxb.PyXBException as e:
          raise ValueError(
            'Unable to serialize PyXB to XML. error="{}"'.format(str(e))
          )
    

    As an example, you can serialize a PyXB object to UTF-8 encoded bytes with

    serialize_gen(pyxb_object, 'utf-8')

    To get a string (str) instead of bytes, pass None as the encoding:

    serialize_gen(pyxb_object, None)
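
The pretty-printing and prolog-stripping steps can be tried in isolation, without PyXB installed. The sketch below is not part of d1_python: `pretty_without_prolog` is a hypothetical helper name, and it assumes the DOM is an `xml.dom.minidom` document, which is what `toDOM()` is assumed to return here. It applies the same two regular expressions to the output of `toprettyxml()`.

```python
import re
from xml.dom import minidom


def pretty_without_prolog(dom, encoding=None):
    """Pretty-print a minidom document and drop the XML prolog.

    Hypothetical helper for illustration only. Returns str when encoding
    is None, bytes otherwise, mirroring serialize_gen().
    """
    xml = dom.toprettyxml(indent='  ', encoding=encoding)
    if encoding is None:
        # str result: use str patterns
        xml = re.sub(r'^\s*$\n', '', xml, flags=re.MULTILINE)
        xml = re.sub(r'^<\?(.*)\?>', '', xml)
    else:
        # bytes result: use bytes patterns
        xml = re.sub(rb'^\s*$\n', b'', xml, flags=re.MULTILINE)
        xml = re.sub(rb'^<\?(.*)\?>', b'', xml)
    return xml.strip()


# Stand-in document; with PyXB this would come from pyxb_object.toDOM()
doc = minidom.parseString('<root><item>1</item></root>')
as_str = pretty_without_prolog(doc)             # str
as_bytes = pretty_without_prolog(doc, 'utf-8')  # bytes
```

Either return value is plain XML with the prolog removed, so it can be passed on to an XML parser such as xmltodict.parse, as the question intends.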