
How to encode all logged messages as utf-8 in Python


I have a small function that returns a logger with up to two handlers attached, so that it logs to a RotatingFileHandler and, optionally, to sys.stdout at the same time.

import os, logging, sys
from logging.handlers import RotatingFileHandler
from config import *

def get_logger(filename, log_level_stdout=logging.WARNING, log_level_file=logging.INFO, echo=True):
    logger = logging.getLogger(__name__)
    if not os.path.exists(PATH + '/Logs'):
        os.mkdir(PATH + '/Logs')

    logger.setLevel(logging.DEBUG)

    if echo:
        prn_handler = logging.StreamHandler(sys.stdout)
        prn_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
        prn_handler.setLevel(log_level_stdout)
        logger.addHandler(prn_handler)

    file_handler = RotatingFileHandler(PATH + '/Logs/' + filename, maxBytes=1048576, backupCount=3)
    file_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s: %(message)s'))
    file_handler.setLevel(log_level_file)
    logger.addHandler(file_handler)
    return logger

This works fine in general, but certain strings being logged appear to be encoded in cp1252 and throw a (non-fatal) error when I try to print them to stdout via the logger. Note that the very same characters are printed just fine in the error message itself. Logging them to a file also seems to cause no issues. Only the console (sys.stdout) appears to throw this error.

--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\logging\__init__.py", line 1084, in emit
    stream.write(msg + self.terminator)
  File "C:\Program Files\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1ecd' in position 65: character maps to <undefined>
Call stack:
  File "script.py", line 147, in <module>
    logger.info(f"F-String with a name in it: '{name}'.")
Message: "F-String with a name in it: 'Heimstọð'."
Arguments: ()

One fix has been to encode every single message as UTF-8 in the code that calls the logger, like this:

logger.info((f"F-String with a name in it: '{name}'.").encode('utf8'))

However, I feel this is neither elegant nor efficient. Note also that logging to the file works just fine, and I already tried setting PYTHONIOENCODING to utf-8 in the Windows system variables without any noticeable effect.
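For illustration, here is a minimal sketch of what that workaround actually puts into the log. The logger name and the in-memory stream are hypothetical stand-ins chosen for the demonstration; the point is only that a bytes message is turned into text with str(), so the log receives the bytes repr rather than the readable name.

```python
import io
import logging

# Hypothetical logger name and in-memory stream, used only for this demonstration.
stream = io.StringIO()
demo_logger = logging.getLogger("encode_workaround_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output out of the root logger
demo_logger.addHandler(logging.StreamHandler(stream))

name = "Heimst\u1ecd\u00f0"  # the name from the traceback above

# The workaround: pass bytes instead of str to the logger.
demo_logger.info(f"Name: '{name}'.".encode("utf8"))

# logging converts the bytes object with str(), so the log line holds
# the escaped repr (b"...\xe1\xbb\x8d...") instead of the original characters.
logged = stream.getvalue()
print(logged)
```

So the error goes away, but only because the message no longer contains any non-ASCII characters; the name itself is no longer readable in the log.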

Update: Turns out I'm stupid. Just because an error message is printed in the console doesn't mean that printing to the console is the cause of the error. I was looking into the answers to the other question that was suggested to me here, and after a while I realized that nothing I did to the "if echo" part of the function had any impact on the result. The last check was commenting out the whole block, and I still got the error. That's when I realized that the issue was in fact caused by not enforcing UTF-8 when writing to the file. Adding the simple kwarg encoding='utf-8' to the RotatingFileHandler, as suggested by @michael-ruth, fixed the issue for me.

P.S. I'm not sure how to handle this case: that answer fixed my problem, but it wasn't really what I was asking for or what the question suggested, because I originally misunderstood the root cause. I'll still mark it as the solution and upvote both answers. I'll also edit the question so as not to mislead future readers into believing it answers a question that it doesn't.
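A small sketch of the root cause, assuming (as the traceback suggests) that the file handler fell back to cp1252 because no encoding was given. It uses only str.encode, so it behaves the same on any platform: cp1252 has no mapping for '\u1ecd', while UTF-8 can encode every character in the name.

```python
import locale

name = "Heimst\u1ecd\u00f0"  # the name from the traceback

# When a file is opened without an explicit encoding, Python typically uses
# the locale's preferred encoding; on many Windows setups that is cp1252.
print("Default text encoding here:", locale.getpreferredencoding(False))

# Encoding with cp1252 fails on the one character it cannot represent.
try:
    name.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    cp1252_ok = False
    failing_char = exc.object[exc.start:exc.end]
    print("cp1252 cannot encode:", ascii(failing_char))

# UTF-8 can represent the whole string and round-trips cleanly.
utf8_bytes = name.encode("utf-8")
print("UTF-8 round trip ok:", utf8_bytes.decode("utf-8") == name)
```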


Solution

  • Set the encoding while instantiating the handler instead of encoding the message explicitly.

    file_handler = RotatingFileHandler(
        PATH + '/Logs/' + filename, 
        maxBytes=1048576, 
        backupCount=3,
        encoding='utf-8'
    )
    

    help(RotatingFileHandler) is your best friend.

    Help on class RotatingFileHandler in module logging.handlers:
    
    class RotatingFileHandler(BaseRotatingHandler)
     |  RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=False)
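As a rough end-to-end check, the sketch below writes the problematic name through a RotatingFileHandler created with encoding='utf-8' and reads the file back. The temporary directory, the file name demo.log, and the logger name are assumptions made for the example; they stand in for PATH + '/Logs/' + filename from the question.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

# Hypothetical stand-in for PATH + '/Logs/' + filename from the question.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "demo.log")

# Same arguments as in the answer, with the encoding set explicitly.
handler = RotatingFileHandler(
    log_path,
    maxBytes=1048576,
    backupCount=3,
    encoding="utf-8",
)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s: %(message)s"))

demo_logger = logging.getLogger("utf8_file_demo")  # hypothetical logger name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(handler)

name = "Heimst\u1ecd\u00f0"
demo_logger.info(f"F-String with a name in it: '{name}'.")

# Close the handler so the file is flushed before reading it back.
demo_logger.removeHandler(handler)
handler.close()

with open(log_path, encoding="utf-8") as fh:
    contents = fh.read()

print(name in contents)
```

With the encoding set on the handler, the calling code can pass plain str messages and needs no per-message .encode() calls.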