mysql python-3.x apache-kafka django-celery confluent-platform

Database connection error while celery worker remains idle for 24 hours


I have a Django-based web application that uses Kafka to process orders. I use Celery workers to run the Kafka consumers: each consumer is started as a Celery task and subscribed to one Kafka topic. However, after a day or so, when I submit a task, I get the following error:

    _mysql.connection.query(self, query)
_mysql_exceptions.OperationalError: (2006, 'MySQL server has gone away')

The above exception was the direct cause of the following exception:

Below is what my tasks.py file looks like:

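# Imports needed by the calls below. logger, BuyOrder, processBuyerPayout and
# INFURA_MAIN_NET_ETH_URL are defined elsewhere in the project.
from threading import Thread

from celery import shared_task
from confluent_kafka import Consumer, KafkaException
from django.core import serializers

@shared_task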
def init_kafka_consumer(topic):
    try:
        if topic is None:
            raise Exception("Topic is none, unable to initialize kafka consumer")
        logger.info("Spawning new task to subscribe to topic")
        params = []
        params.append(topic)
        background_thread = Thread(target=sunscribe_consumer, args=params)
        background_thread.start()
    except Exception :
        logger.exception("An exception occurred while reading message from kafka")

def sunscribe_consumer(topic) :
    try:
        if topic is None:
            raise Exception("Topic is none, unable to initialize kafka consumer")
        conf = {'bootstrap.servers': "localhost:9092", 'group.id': 'test', 'session.timeout.ms': 6000,
                'auto.offset.reset': 'earliest'}
        c = Consumer(conf)
        logger.info("Subscribing consumer to topic "+str(topic[0]))
        c.subscribe(topic)
        # Read messages from Kafka
        try:
            while True:
                msg = c.poll(timeout=1.0)
                if msg is None:
                    continue
                if msg.error():
                    raise KafkaException(msg.error())
                else:
                    try:
                        objs = serializers.deserialize("json", msg.value())
                        for obj in objs:
                            order = obj.object
                            order = BuyOrder.objects.get(id=order.id) #Getting an error while accessing DB
                            if order.is_pushed_to_kafka :
                                return
                            order.is_pushed_to_kafka = True
                            order.save()
                            from web3 import HTTPProvider, Web3, exceptions
                            w3 = Web3(HTTPProvider(INFURA_MAIN_NET_ETH_URL))
                            processBuyerPayout(order,w3)
                    except Exception :
                        logger.exception("An exception occurred while de-serializing message")
        except Exception :
            logger.exception("An exception occurred while reading message from kafka")
        finally:
            c.close()
    except Exception :
        logger.exception("An exception occurred while reading message from kafka")

Is there any way to check whether the database connection is still alive as soon as a task is received and, if it is not, to re-establish it?


Solution

  • According to https://github.com/celery/django-celery-results/issues/58#issuecomment-418413369 and the comments above it, putting this code inside your task should help:

    from django.db import close_old_connections
    close_old_connections()

    It closes connections that are no longer usable or have exceeded CONN_MAX_AGE; Django then opens a new connection on the next query.
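
One detail that matters for the code in the question: Django keeps database connections per thread, and the query that fails runs inside the background consumer thread, not in the Celery task body. So the call probably needs to happen in that thread, around each message, rather than once when the task starts. Below is a minimal sketch of one way to do that. The names `refresh_db_connections` and `with_fresh_db_connection` are hypothetical helpers, not part of Django or Celery; the only real API used is `django.db.close_old_connections`.

```python
import functools


def refresh_db_connections():
    """Close connections that are unusable or older than CONN_MAX_AGE.

    Django opens a new connection lazily on the next query.
    """
    # Imported lazily so this module can be loaded before Django is configured.
    from django.db import close_old_connections
    close_old_connections()


def with_fresh_db_connection(func, refresh=refresh_db_connections):
    """Hypothetical decorator: run `refresh` before and after `func`.

    `refresh` is injectable so the wrapper can be exercised without a database.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        refresh()  # discard a connection that went stale while idle
        try:
            return func(*args, **kwargs)
        finally:
            refresh()  # clean up a connection that broke during the call
    return wrapper
```

In the consumer loop from the question, the per-message work (the `BuyOrder.objects.get(...)` and `order.save()` part) could be moved into a function decorated with `with_fresh_db_connection`, so the check runs in the consumer thread before every message. This mirrors what Django itself does for web requests, where `close_old_connections` is called at the start and end of each request.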