The Celery docs, in the section on instantiation (http://celery.readthedocs.org/en/latest/userguide/tasks.html#custom-task-classes), state the following:
"A task is not instantiated for every request, but is registered in the task registry as a global instance.
This means that the __init__ constructor will only be called once per process, and that the task class is semantically closer to an Actor."
Nevertheless, when I run the following example, I see that the __init__ method is called at least 3 times, each time on a different instance (the printed ids differ). What is wrong with the setup? Setting CELERYD_CONCURRENCY = 1 should make sure that there is only one process per worker, right?
$ celery -A proj beat
celery beat v3.1.17 (Cipater) is starting.
init Task1
40878160
x=1.0
init Task1
40878352
x=1.0
init Task1
40879312
x=1.0
__ - ... __ - _
Configuration ->
. broker -> amqp://guest:**@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2015-02-05 23:05:21,875: INFO/MainProcess] beat: Starting...
[2015-02-05 23:05:21,971: INFO/MainProcess] Scheduler: Sending due task task1-every-5-seconds (proj.tasks.t1)
[2015-02-05 23:05:26,972: INFO/MainProcess] Scheduler: Sending due task task1-every-5-seconds (proj.tasks.t1)
celery.py:
from __future__ import absolute_import
from datetime import timedelta
from celery import Celery

app = Celery('proj',
             broker='amqp://guest@localhost//',
             backend='amqp://',
             include=['proj.tasks'])

app.conf.update(
    CELERY_REDIRECT_STDOUTS=True,
    CELERY_TASK_RESULT_EXPIRES=60,
    CELERYD_CONCURRENCY=1,
    CELERYBEAT_SCHEDULE={
        'task1-every-5-seconds': {
            'task': 'proj.tasks.t1',
            'schedule': timedelta(seconds=5)
        },
    },
    CELERY_TIMEZONE='GMT',
)

if __name__ == '__main__':
    app.start()
tasks.py:
from __future__ import absolute_import
from proj.celery import app
from celery import Task
import time


class Foo():
    def __init__(self, x):
        self.x = x


class Task1(Task):
    abstract = True

    def __init__(self):
        print "init Task1"
        print id(self)
        self.f = Foo(1.0)
        print "x=1.0"


@app.task(base=Task1)
def t1():
    t1.f.x += 1
    print t1.f.x
So, as per your comment, you need to maintain one connection per thread.
Why not use thread-local storage then? It should be a safe solution in your case: each thread creates its connection on first use and reuses it afterwards.
from threading import local

from proj.celery import app

thread_storage = local()


def get_or_create_connection(*args, **kwargs):
    # Connection is a placeholder for whatever client class you use.
    # It is created lazily, once per thread, and reused afterwards.
    if not hasattr(thread_storage, 'connection'):
        thread_storage.connection = Connection(*args, **kwargs)
    return thread_storage.connection


@app.task()
def do_stuff():
    connection = get_or_create_connection('some', connection='args')
    connection.ping()
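If you want to convince yourself of the per-thread behaviour without a broker or a worker, here is a minimal sketch of the same idea in plain Python. The `Connection` class is a hypothetical stand-in (it only records which thread created it), and `worker` and `run_demo` are helper names I made up for the illustration; none of this is Celery API.

```python
import threading


class Connection(object):
    """Hypothetical stand-in for a real client connection."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
        # Remember which thread created this object.
        self.owner = threading.current_thread().name

    def ping(self):
        return "pong from %s" % self.owner


thread_storage = threading.local()


def get_or_create_connection(*args, **kwargs):
    # Attributes set on a threading.local object are visible only
    # to the thread that set them.
    if not hasattr(thread_storage, 'connection'):
        thread_storage.connection = Connection(*args, **kwargs)
    return thread_storage.connection


def worker(results, index):
    # Two calls in the same thread should return the same object.
    first = get_or_create_connection('some', connection='args')
    second = get_or_create_connection('some', connection='args')
    # Keep a reference to the connection so it stays alive for inspection.
    results[index] = (first is second, first)


def run_demo(n_threads=3):
    results = [None] * n_threads
    threads = [threading.Thread(target=worker, args=(results, i))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == '__main__':
    for reused, conn in run_demo():
        print("%s reused=%s" % (conn.ping(), reused))
```

Each thread should report that its second call reused the first connection, while different threads end up with different `Connection` objects. Note that this only isolates threads within one process; separate worker processes each have their own module state anyway.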