Cascading logging in Python - Are there pitfalls i dont see?

I spent a lot of time in finding a setup for logging that works best for our projects. Our team is quite young and standards are establishing only now. We want to have well formatted logs that are taken care of in different handlers. That should be true if the whole application is started from main as well as a sub-package is run and debugged on its own. For that, we put together the following file structure:

project_folder
¦- main.py
¦- package_a
¦  ¦- module_1
¦  ¦- module_2
¦- logging_setup.py

With that, every file (main, module_1, etc.) has the following two lines before the rest of the import statements:

import logging, logging_setup
logger = logging_setup.setup("module_1") if not logging.getLogger().hasHandlers() else logging.getLogger()

This checks if its "parent" has set up a logger. If not, it creates its own according to logging_setup. As an example, it would create its own file with the name appendix "module_1" for recognition. Foldercreation for the file is also handled in logging_setup, so reusability should be given as long as logging_setup is present. And as every project needs some form of boiler plate code for logging, this could develop into a standard of our projects. This also works with pytest as it creates his own logger, which would be used the same way.

Are there pitfalls i am overseeing or does a way exist that makes it easier maintainable when projects really go up in scale?

Thank you for your insights!

Solution

Are there pitfalls i am overseeing or does a way exist that makes it easier maintainable when projects really go up in scale?

Yes, there are problems with this arrangement at any scale beyond "ad-hoc one-off script that won't be used frequently or maintained/developed by multiple people".

You do not define what "set up" means, but it sounds like you are talking about creating handlers and formatters, and attaching them to individual loggers.

It's normally not a good practice to "set up" formatted and handlers at the top level of every module in this sense.

The intended usage of the logging framework is so that logging a message is, by itself, cheap and free of side effects. The logging system is intended to be a black box, into which libraries dump messages, deliberately unaware of where they will be routed or how they will be handled.

If you are writing a "library", that is as far as it should go! Log handling and formatting is meant to be controlled by the user that is importing your code. Their logging to setup might be very different from what you anticipate, and your imposition of a particular set of handlers and for matters might make it impossible for other people to use this library.

If you are writing an "application", you don't really need to worry about other users importing your modules, so you can get away with a set up like this. However, I would still suggest that it is not good practice. Among other reasons, it's usually a good idea to write your applications as if they were the composition of many libraries, and to avoid interleaving "application-level" logic throughout.

As you also seem to be discovering with your attempt to write some kind of fallback logic, performing application setup when a module is imported means that you have poor control over when that setup is performed. You will spend a lot of time writing and debugging fallback logic. This is a waste of your effort, and it's not necessary because there is a better way to do things.

There also seem to be a couple of misunderstandings about the logging system reflected in your design.

Logging is "hierarchical", in that logger a.b is considered a "parent" of a.b.c, and my default log messages are propagated upward to their parents*. This means that, if you configure a handler for a.b, then messages logged to a.b.c will also be handled by the handler attached to a.b. You should not need to attach a handler to both a.b and a.b.c; you will get duplicate log messages if you do!

Moreover, your log_setup function appears to obtain a logger that is scoped to the current module, but your "fallback" code obtains the root logger, the special logger with no name ('') that is considered the parent of all other loggers.

My recommendation is to only call logging.getLogger(__name__) inside each module, and to configure all required handlers and formatters only inside main.py. this is standard practice how it is, and hopefully this answer provides some insight into why that is so.