Python error "not well-formed (invalid token)"


I have some software that outputs an XML file, which I am trying to read with Python so I can get the results and add them to my database.

import xml.etree.ElementTree as etree
with open('E:/uk_bets_history.xml', 'r') as xml_file:
    xml_tree = etree.parse(xml_file)

I am getting the error xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 1, but I am unsure why the file is considered malformed.

I am not in control of how the file is created as this is done by some other software I own.

The example xml is here: http://jarrattperkins.com/uk_bets_history


Solution

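  • The file you provided as an example is encoded as UTF-8 with a BOM (byte order mark), so you need to call open() with the encoding argument. Without it, the BOM bytes are decoded using the platform's default encoding and reach the parser as stray characters ahead of the XML declaration, which is what the error at line 1 points to: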

    open("FILE_PATH", encoding="utf-8-sig")
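
As a minimal sketch of the fix in context, the snippet below writes a small BOM-prefixed XML file and parses it both ways. The helper name parse_xml_with_bom, the sample content, and the temporary file are assumptions for illustration and are not taken from the original file.

```python
import os
import tempfile
import xml.etree.ElementTree as etree


def parse_xml_with_bom(path):
    """Parse an XML file that may start with a UTF-8 byte order mark.

    Hypothetical helper: "utf-8-sig" drops a leading BOM if one is present
    and otherwise decodes exactly like plain UTF-8.
    """
    with open(path, "r", encoding="utf-8-sig") as xml_file:
        return etree.parse(xml_file)


# Made-up sample data that begins with the UTF-8 BOM (bytes EF BB BF).
sample = b"\xef\xbb\xbf<?xml version='1.0' encoding='utf-8'?><bets><bet id='1'/></bets>"
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

try:
    # Option 1: text mode with an explicit BOM-aware encoding.
    root_tag = parse_xml_with_bom(sample_path).getroot().tag

    # Option 2: hand the path to the parser so it reads raw bytes
    # and works out the encoding (including the BOM) itself.
    root_tag_from_path = etree.parse(sample_path).getroot().tag
finally:
    os.remove(sample_path)
```

Passing the path straight to etree.parse() avoids the text-mode decoding step altogether, so it may be the simpler choice when nothing else needs the open file object.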