Celery: disable gql logging


I am using Celery in a slightly complex application. One of my tasks queries large arrays from a GraphQL backend. I want to use the Celery logger, but with a log level of INFO or lower, the entire request and response from the gql client shows up in the logs.

This is really annoying since now all the important information is just buried somewhere in the logs. I haven't worked with Python logging a lot but I already tried a few things to disable the logging.

First of all, this is the code to query data from GraphQL:

from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport
import os
import pandas as pd

transport = AIOHTTPTransport(url=os.environ["GRAPHQL_ENDPOINT"])

# Create a GraphQL client using the defined transport
client = Client(transport=transport, fetch_schema_from_transport=True)

provide_vectors = gql(
    """
    query provideVectors($query: PublicationVectorsRequestDto!) {
        provideVectors(provideVectors: $query) {
            chunk,data{id, vectors}
        }
    }
    """
)


def get_vectors(index: int, size=5) -> pd.DataFrame:
    params = {"query": {"chunk": index, "chunkSize": size}}
    try:
        response = client.execute(provide_vectors, variable_values=params)["provideVectors"]
        data = response["data"]
        return pd.DataFrame.from_dict(data)
    except Exception as e:
        print("Could not connect to graphql server because:" + "\n" + str(e))
        return pd.DataFrame()

My Celery setup looks like this:

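import logging

from celery import Celery
from celery.signals import after_setup_logger

# celeryconfig, TqdmLoggingHandler and graphql_backend are defined elsewhere in the project
celery = Celery(__name__)
celery.config_from_object(celeryconfig)
logger = logging.getLogger(__name__)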

@after_setup_logger.connect
def setup_loggers(logger, *args, **kwargs):
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

    # FileHandler
    fh = logging.FileHandler('celery_logs.log')
    fh.setFormatter(formatter)
    logger.addHandler(fh)

    logger.addHandler(TqdmLoggingHandler(logging.DEBUG))


@celery.task(name="task")
def task():
    logger.info("Starting fetch")
    result = graphql_backend.get_all_vectors()

These code snippets are from different files.

The gql documentation says that logging can be disabled by adding:

from gql.transport.requests import log as requests_logger
requests_logger.setLevel(logging.WARNING)

I added this both to the setup_loggers function and to the gql code. Neither seems to change anything. I also tried adding

logging.getLogger("requests").setLevel(logging.WARNING)
logging.getLogger("urllib3").setLevel(logging.WARNING)

to the setup_loggers function.

I know a workaround is setting the log level of the Celery worker to WARNING, but this would break other parts of the application. Is there a way to disable or filter the logger of the gql module, or is there another module for querying GraphQL that does not produce logs (I have tried https://github.com/profusion/sgqlc)?


Solution

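  • I figured out a solution. Adding

    import logging

    class NoGQLFilter(logging.Filter):
        """Reject every record passed to the logger this filter is attached to."""
        def filter(self, record):
            return False

    # The aiohttp transport logs under its own module name.
    gql_logger = logging.getLogger("gql.transport.aiohttp")
    gql_logger.addFilter(NoGQLFilter())

    to the top of my Celery worker Python file filters out the gql logs. Note that this drops every record from that logger, including warnings and errors. There is surely a better way to do this, but it works.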
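
A less drastic variant of the filter above, as a minimal sketch: raise the level of the same "gql.transport.aiohttp" logger instead of rejecting every record, so the request and response dumps logged at INFO disappear while warnings and errors from gql still get through. The snippet from the gql docs targets gql.transport.requests, the logger of the requests-based transport, which is presumably why it has no effect when AIOHTTPTransport is used. The helper name quiet_gql_logger is hypothetical, and the logger name is assumed from the transport in the question.

```python
import logging

# Assumption: the aiohttp transport logs through a module-level logger
# named after its module, as in the filter-based solution above.
GQL_LOGGER_NAME = "gql.transport.aiohttp"


def quiet_gql_logger(level: int = logging.WARNING) -> logging.Logger:
    """Raise the threshold of the gql transport logger (hypothetical helper).

    Records below `level` are dropped by this logger before they reach any
    handler; other loggers (e.g. the Celery task logger) are unaffected.
    """
    gql_logger = logging.getLogger(GQL_LOGGER_NAME)
    gql_logger.setLevel(level)
    return gql_logger


# Call once at import time of the worker module, before any task runs.
quiet_gql_logger()
```

Because the level is set on the named logger itself rather than on the root logger, the worker can keep running at INFO and logger.info("Starting fetch") in the task is still emitted.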