Tags: python, ubuntu, lxml, ubuntu-18.04, arelle

Getting error about bad escape during start of Arelle


I am trying to get Arelle working on Ubuntu Linux 18.04 with Python 3.6.9.

Step 1: Download Arelle:

git clone https://github.com/Arelle/Arelle.git -b lxml

Step 2: Install Python lxml:

apt-get install -y python-lxml

Step 3: Install Python tk:

Because of the error 'No module named tkinter', I installed:

apt-get install python3-tk


To start Arelle from the terminal, I run:

python3 arelleGUI.pyw

I then get the following error:

Traceback (most recent call last):
  File "arelleGUI.pyw", line 9, in <module>
    from arelle import CntlrWinMain
  File "/tmp3/Arelle/arelle/CntlrWinMain.py", line 22, in <module>
    from arelle import Cntlr
  File "/tmp3/Arelle/arelle/Cntlr.py", line 8, in <module>
    from arelle import ModelManager
  File "/tmp3/Arelle/arelle/ModelManager.py", line 8, in <module>
    from arelle import (ModelXbrl, Validate, DisclosureSystem)
  File "/tmp3/Arelle/arelle/Validate.py", line 9, in <module>
    from arelle import (ModelXbrl, ModelVersReport, XbrlConst, ModelDocument,
  File "/tmp3/Arelle/arelle/ModelVersReport.py", line 9, in <module>
    from arelle import (XbrlConst, XbrlUtil, XmlUtil, UrlUtil, ModelXbrl, ModelDocument, ModelVersObject)
  File "/tmp3/Arelle/arelle/ModelDocument.py", line 9, in <module>
    from arelle import (XbrlConst, XmlUtil, UrlUtil, ValidateFilingText, XmlValidate)
  File "/tmp3/Arelle/arelle/ValidateFilingText.py", line 16, in <module>
    docCheckPattern = re.compile(r"&\w+;|[^0-9A-Za-z`~!@#$%&\*\(\)\.\-+ \[\]\{\}\|\\:;\"'<>,_?/=\t\n\r\m\f]") # won't match &#nnn;
  File "/usr/lib/python3.6/re.py", line 233, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.6/re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.6/sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.6/sre_parse.py", line 855, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.6/sre_parse.py", line 416, in _parse_sub
    not nested and not items))
  File "/usr/lib/python3.6/sre_parse.py", line 527, in _parse
    code1 = _class_escape(source, this)
  File "/usr/lib/python3.6/sre_parse.py", line 336, in _class_escape
    raise source.error('bad escape %s' % escape, len(escape))
sre_constants.error: bad escape \m at position 67

I found this SO question that seems related to the issue.


Solution

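  • This is a bug in Arelle that shows up on Python 3.6 and later. The pattern in ValidateFilingText.py contains \m, which is not a valid regular-expression escape. Since Python 3.6, re.compile raises an error for an unknown escape made of a backslash and an ASCII letter, where earlier versions accepted it. There is a pull request for it, but it has been open since July 2017. Given that Python 3.6 has been out for quite a while, I don't know why this hasn't been fixed.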

    You are using the lxml branch, which has been stale for 10 years. So this error may already be fixed on the master branch (even though the pull request is still open) but not on the lxml branch. Try installing from master first, if that is an option for you.
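
If switching branches is not an option, the failure can be reproduced and patched locally. The sketch below copies the pattern from the traceback and shows that it compiles once the \m is removed. It assumes the \m was a stray escape with no intended meaning; that is a guess, not something confirmed by the Arelle source history. The names BAD, FIXED and compiles are made up for this illustration.

```python
import re

# Pattern copied verbatim from the traceback
# (arelle/ValidateFilingText.py, line 16).
BAD = r"&\w+;|[^0-9A-Za-z`~!@#$%&\*\(\)\.\-+ \[\]\{\}\|\\:;\"'<>,_?/=\t\n\r\m\f]"

# Assumption: \m is a stray escape, so dropping it keeps the intended
# character class (tab, newline, carriage return, form feed remain).
FIXED = BAD.replace(r"\m", "")


def compiles(pattern):
    """Return True if the re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(BAD))    # False on Python 3.6 and later
print(compiles(FIXED))  # True

# The patched pattern still flags named entities and characters
# outside the allowed set.
doc_check = re.compile(FIXED)
print(doc_check.findall("abc &amp; \u00e9"))
```

Applying the same one-character change to line 16 of your checkout's ValidateFilingText.py should let the import get past this point, although other modules on such an old branch may still fail for unrelated reasons.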