I'm using Pylint to check my code when I do commits. Recently, I've had a commit fail because of the following error:
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1699-1713: character maps to <undefined>
Here's the traceback:
Traceback (most recent call last):
  File "\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "\venv\tso_ingestion\Scripts\pylint.EXE\__main__.py", line 7, in <module>
  File "\venv\tso_ingestion\lib\site-packages\pylint\__init__.py", line 36, in run_pylint
    PylintRun(argv or sys.argv[1:])
  File "\venv\tso_ingestion\lib\site-packages\pylint\lint\run.py", line 213, in __init__
    linter.check(args)
  File "\venv\tso_ingestion\lib\site-packages\pylint\lint\pylinter.py", line 701, in check
    with self._astroid_module_checker() as check_astroid_module:
  File "\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 142, in __exit__
    next(self.gen)
  File "\venv\tso_ingestion\lib\site-packages\pylint\lint\pylinter.py", line 1010, in _astroid_module_checker
    checker.close()
  File "\venv\tso_ingestion\lib\site-packages\pylint\checkers\similar.py", line 875, in close
    self.add_message("R0801", args=(len(couples), "\n".join(msg)))
  File "\venv\tso_ingestion\lib\site-packages\pylint\checkers\base_checker.py", line 164, in add_message
    self.linter.add_message(
  File "\venv\tso_ingestion\lib\site-packages\pylint\lint\pylinter.py", line 1323, in add_message
    self._add_one_message(
  File "\venv\tso_ingestion\lib\site-packages\pylint\lint\pylinter.py", line 1281, in _add_one_message
    self.reporter.handle_message(
  File "\venv\tso_ingestion\lib\site-packages\pylint\reporters\text.py", line 208, in handle_message
    self.write_message(msg)
  File "\venv\tso_ingestion\lib\site-packages\pylint\reporters\text.py", line 201, in write_message
    self.writeln(self._fixed_template.format(**self_dict))
  File "\venv\tso_ingestion\lib\site-packages\pylint\reporters\base_reporter.py", line 64, in writeln
    print(string, file=self.out)
  File "\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1699-1713: character maps to <undefined>
The only change I could see that could possibly result in an encoding error was refactoring a set of asserts that looked like this:
assert e_info.type is PageException
assert e_info.value.args[0].url == url
assert e_info.value.args[0].body == soup.body
assert e_info.value.args[0].element == soup.body.find("form", id="form").find(
    "a", string="料金通知情報一覧"
)
assert (
    "Onclick event associated with \\'料金通知情報一覧\\' link was missing or malformed"
    in str(e_info.value.args[0])
)
to look like this:
check_page_exception(
    e_info,
    url,
    soup.body,
    soup.body.find("form", id="form").find("a", string="料金通知情報一覧"),
    "Onclick event associated with \\'料金通知情報一覧\\' link was missing or malformed",
)
There are Unicode characters in here, but they have only moved around, so I don't see how this could be causing the error. Does anyone know how to fix this?
Thanks to a comment from @KlausD., I was able to diagnose and fix the issue. The problem was that my shell was set to the ANSI character set (cp1252, as the last frame of the traceback shows), and Pylint was using the shell's default character set to print its messages. Although much of my code involves text written in Japanese and Pylint had certainly reported errors before, none of those messages involved actually printing the text. This time the message being printed was R0801 from the similarity checker, which includes the matching lines themselves, so the Japanese strings ended up in the output and cp1252 could not encode them. The answer provided here fixed this issue.
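For anyone who wants to see the failure in isolation, here is a minimal sketch that does not involve Pylint at all. It assumes the output stream behaves like a text stream opened with a given encoding; the helper name `can_print` and the in-memory stream are only for illustration and are not part of Pylint or of the linked answer.

```python
import io

# One of the strings from the refactored test, as it would appear in a Pylint message.
TEXT = "Onclick event associated with '料金通知情報一覧' link was missing or malformed"


def can_print(text: str, encoding: str) -> bool:
    """Return True if `text` can be printed to a stream that uses `encoding`.

    Hypothetical helper: it stands in for a console whose encoding is
    `encoding` by wrapping an in-memory byte buffer.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        # Same exception type as in the traceback above.
        return False
    return True


cp1252_ok = can_print(TEXT, "cp1252")  # encoding of the ANSI shell
utf8_ok = can_print(TEXT, "utf-8")     # encoding that can represent Japanese

print("cp1252:", cp1252_ok)
print("utf-8:", utf8_ok)
```

The usual ways to get a UTF-8 output stream for a Python program on Windows are to set the environment variable `PYTHONIOENCODING=utf-8` or `PYTHONUTF8=1` before running it, or to call `sys.stdout.reconfigure(encoding="utf-8")` from inside the program (Python 3.7+). I am not claiming that any one of these is exactly what the linked answer proposes, only that they address the same mismatch.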