python, django, celery, django-celery

Django Celery tasks run twice


My project directory structure is as follows:

rss_reader  
 - reader  
    - __init__.py   
    - models.py  
    - views.py  
    - ...  
 - rss_reader  
    - __init__.py   
    - settings.py  
    - urls.py  
    - wsgi.py  
 - rss_contents  
    - __init__.py   
    - celery.py  
    - tasks.py  
    - config.py  
 - ...  

tasks.py

from __future__ import absolute_import
from rss_contents.celery import app

@app.task
def processArticle():
    print "100"

celery.py

from __future__ import absolute_import
from celery import Celery

app = Celery('rss_contents', include=['rss_contents.tasks'])
app.config_from_object('rss_contents.config')

config.py

from __future__ import absolute_import

CELERY_RESULT_BACKEND = 'redis://127.0.0.1:6379/5'
BROKER_URL = 'redis://127.0.0.1:6379/6'

reader > __init__.py

from rss_contents import tasks
tasks.processArticle.delay()

My steps are as follows:

  1. celery -A rss_contents worker -l info

The console displays:

[2017-01-11 10:34:33,829: INFO/MainProcess] Connected to redis://127.0.0.1:6379/6  
[2017-01-11 10:34:33,855: INFO/MainProcess] mingle: searching for neighbors  
[2017-01-11 10:34:34,861: INFO/MainProcess] mingle: all alone  
[2017-01-11 10:34:34,892: WARNING/MainProcess] celery@DESKTOP-6KAT7MF ready.

  2. Run the Django server (Starting development server at http://127.0.0.1:8000/)

But the celery console shows that the task was run twice:

[2017-01-11 10:41:20,910: INFO/MainProcess] Received task: rss_contents.tasks.processArticle[aa355c77-4ee8-4208-9e8c-915b110c7bbd]  
[2017-01-11 10:41:20,911: WARNING/Worker-1] 100
[2017-01-11 10:41:20,917: INFO/MainProcess] Task rss_contents.tasks.processArticle[aa355c77-4ee8-4208-9e8c-915b110c7bbd] succeeded in 0.00600004196167s: None  

[2017-01-11 10:41:22,430: INFO/MainProcess] Received task: rss_contents.tasks.processArticle[d92a151c-f0f9-4e8f-921a-fff2c1eb64c6]  
[2017-01-11 10:41:22,431: WARNING/Worker-1] 100   
[2017-01-11 10:41:22,447: INFO/MainProcess] Task rss_contents.tasks.processArticle[d92a151c-f0f9-4e8f-921a-fff2c1eb64c6] succeeded in 0.0160000324249s: None

Why does the task run twice? How can I make it run only once?

Thanks in advance.


Solution

  • @enix is right. Because reader is an app, its __init__.py runs once when Django initializes the application registry. The package may then be imported a second time, for example because some other module imports reader at module level, and that second import queues the task again. In any case, the easiest fix is to move the call into a class method guarded by a flag, and invoke that method from any Python file:

    class AppInitializer(object):
        initialized = False
    
        @classmethod 
        def initialize(cls):
            if not cls.initialized:
                cls.initialized = True
                from rss_contents import tasks
                tasks.processArticle.delay()
    
    AppInitializer.initialize()
    

    If your application server is multi-threaded, you'll want to wrap the body of initialize in a lock.
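
    A minimal sketch of that locked variant, assuming a plain threading.Lock is enough for your setup. The callback parameter and the boolean return value are illustrative additions, not part of the snippet above. Note that the flag lives in process memory, so it only prevents duplicates within a single process.

```python
import threading


class AppInitializer(object):
    """Runs a start-up callable at most once per process."""

    _initialized = False
    _lock = threading.Lock()

    @classmethod
    def initialize(cls, callback):
        # Hold the lock while checking and setting the flag, so two
        # threads cannot both observe _initialized == False.
        with cls._lock:
            if cls._initialized:
                return False
            cls._initialized = True
        # Only the first caller reaches this point.
        callback()
        return True
```

    In the project from the question, this would be called as AppInitializer.initialize(tasks.processArticle.delay) after importing tasks from rss_contents.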