I am using Celery in a slightly complex application. One of my tasks queries large arrays from a GraphQL backend. I want to use the Celery logger, but when the log level is set to INFO or lower, the whole request and response from the gql client end up in the logs.
This is really annoying, since all the important information gets buried somewhere in the logs. I haven't worked with Python logging a lot, but I have already tried a few things to disable this logging.
First of all, this is the code to query data from GraphQL:
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport
import os
import pandas as pd

transport = AIOHTTPTransport(url=os.environ["GRAPHQL_ENDPOINT"])

# Create a GraphQL client using the defined transport
client = Client(transport=transport, fetch_schema_from_transport=True)

provide_vectors = gql(
    """
    query provideVectors($query: PublicationVectorsRequestDto!) {
        provideVectors(provideVectors: $query) {
            chunk, data { id, vectors }
        }
    }
    """
)


def get_vectors(index: int, size: int = 5) -> pd.DataFrame:
    params = {"query": {"chunk": index, "chunkSize": size}}
    try:
        response = client.execute(provide_vectors, variable_values=params)["provideVectors"]
        data = response["data"]
        return pd.DataFrame.from_dict(data)
    except Exception as e:
        print("Could not connect to graphql server because:" + "\n" + str(e))
        return pd.DataFrame()
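For reference, a single call like the one below is already enough to produce the verbose request/response output at log level INFO; the chunk index here is just an illustrative value.

df = get_vectors(index=0, size=5)
print(df.head())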
My celery setup looks like this:
import logging

from celery import Celery
from celery.signals import after_setup_logger

celery = Celery(__name__)
celery.config_from_object(celeryconfig)

logger = logging.getLogger(__name__)


@after_setup_logger.connect
def setup_loggers(logger, *args, **kwargs):
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

    # FileHandler
    fh = logging.FileHandler('celery_logs.log')
    fh.setFormatter(formatter)
    logger.addHandler(fh)
    logger.addHandler(TqdmLoggingHandler(logging.DEBUG))


@celery.task(name="task")
def task():
    logger.info("Starting fetch")
    result = graphql_backend.get_all_vectors()
These code snippets are from different files.
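TqdmLoggingHandler is not shown in the snippets; it is roughly the usual tqdm recipe, i.e. a logging.Handler that writes formatted records through tqdm.write so that log lines do not break progress bars (the exact implementation in my project may differ slightly):

import logging

import tqdm


class TqdmLoggingHandler(logging.Handler):
    """Write log records via tqdm so active progress bars are not broken."""

    def emit(self, record):
        try:
            msg = self.format(record)
            tqdm.tqdm.write(msg)
            self.flush()
        except Exception:
            self.handleError(record)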
gql says in its documentation that logging can be disabled by adding:
from gql.transport.requests import log as requests_logger
requests_logger.setLevel(logging.WARNING)
I added this both to the setup_loggers function and to the gql query code. Neither changed anything. I also tried adding
logging.getLogger("requests").setLevel(logging.WARNING)
logging.getLogger("urllib3").setLevel(logging.WARNING)
to the setup_loggers function.
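To figure out which logger is actually producing the output, it can help to list the logger names that exist at runtime (the %(name)s field of the formatter above shows the same name on each record). loggerDict is not an officially documented API, but it is a common introspection trick:

import logging

# Print every logger created so far together with its effective level;
# the one dumping the GraphQL requests/responses shows up under its
# dotted module name (e.g. gql.transport.aiohttp for the aiohttp transport).
for name in sorted(logging.root.manager.loggerDict):
    print(name, logging.getLogger(name).getEffectiveLevel())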
I know a workaround is setting the log level of the Celery worker to WARNING, but this would break other parts of the application. Is there a way to disable or filter the logger of the gql module, or is there another module for querying GraphQL that does not produce logs? (I have already tried https://github.com/profusion/sgqlc.)
I figured out a solution: by adding

import logging


class NoGQLFilter(logging.Filter):
    """Drop every record emitted by the gql aiohttp transport logger."""

    def filter(self, record):
        return False


gql_logger = logging.getLogger("gql.transport.aiohttp")
gql_logger.addFilter(NoGQLFilter())

to the top of my Celery worker's Python file, the gql logs get filtered out. There is surely a better way to do this, but it works.
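A lighter-weight alternative, assuming gql exposes a logger for the aiohttp transport the same way it does for the requests transport in the docs snippet above, is to raise that logger's level instead of dropping every record with a filter:

import logging

# Hide the per-request INFO/DEBUG output of the aiohttp transport while
# still letting warnings and errors through.
logging.getLogger("gql.transport.aiohttp").setLevel(logging.WARNING)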