Is it possible (and/or would it be effective) to use Postgres' hstore as a broker for celery?
I'm restricted (absent some very compelling reason) to using a Postgres db. I have a Django app with Celery tasks. Currently I am using the standard database transport, but the Celery docs strongly recommend against that approach for anything beyond very small task queues. I was looking into installing Redis when I came across some info about the hstore feature of Postgres, and the suggestion that it provides equivalent functionality to Redis.
I haven't seen anything about using hstore specifically for celery, though, which seems odd if it really can substitute for redis. Looking through the celery backend code at
https://github.com/celery/celery/blob/master/celery/backends/base.py
it looks like the base celery KeyValueStoreBackend is a pretty simple API:
    def get(self, key):
        raise NotImplementedError('Must implement the get method.')

    def mget(self, keys):
        raise NotImplementedError('Does not support get_many')

    def set(self, key, value):
        raise NotImplementedError('Must implement the set method.')

    def delete(self, key):
        raise NotImplementedError('Must implement the delete method')

    def incr(self, key):
        raise NotImplementedError('Does not implement incr')
but before I potentially pour a bunch of time into this it seemed worth asking whether there's something I'm missing that would argue against implementing this API using hstore and using that as a celery backend.
E.g., does celery have requirements that aren't captured by this API (such as atomicity, scalability, or reliability under load)? Would implementing this using hstore fail to provide a substantial improvement over the existing database backend? I'm fairly new to celery and have never used hstore, so I'm not sure what (if anything) I'm overlooking.
hstore absolutely does not provide "equivalent functionality to redis".
An hstore field is not a key-value DB in a field. Trying to use it that way will lead to pain and terrible performance, because the whole record containing the hstore field must be rewritten for every update. Additionally, the same challenges that apply to task queuing in a relational DB apply with hstore. You'll get at best the performance of a single worker; you won't get real concurrency, even though it might superficially look like you do.
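To make the rewrite cost concrete, here is a rough sketch of the SQL that each method of the API in the question would have to issue. It assumes a hypothetical table `celery_kv` with a single row (`id = 1`) whose hstore column `data` holds every key; the table, the column, and the helper names are made up for illustration and are not part of celery. The sketch only builds statements and does not talk to a database.

```python
# Hypothetical layout (an assumption, not something celery provides):
#   CREATE TABLE celery_kv (id int PRIMARY KEY, data hstore);
TABLE = "celery_kv"


def hstore_statement(op, key=None, value=None):
    """Return (sql, params) for one key-value operation on the hstore column."""
    if op == "get":
        # Reads use the hstore "->" operator and do not modify the row.
        return (f"SELECT data -> %s::text FROM {TABLE} WHERE id = 1", (key,))
    if op == "set":
        # "||" merges a one-pair hstore into the existing value.
        return (
            f"UPDATE {TABLE} SET data = data || hstore(%s::text, %s::text) "
            "WHERE id = 1",
            (key, value),
        )
    if op == "delete":
        return (
            f"UPDATE {TABLE} SET data = delete(data, %s::text) WHERE id = 1",
            (key,),
        )
    if op == "incr":
        # hstore values are text, so the counter is cast on every increment.
        return (
            f"UPDATE {TABLE} SET data = data || hstore(%s::text, "
            "(COALESCE((data -> %s::text)::bigint, 0) + 1)::text) "
            "WHERE id = 1 RETURNING data -> %s::text",
            (key, key, key),
        )
    raise ValueError(f"unknown operation: {op}")


def rewrites_row(op):
    """True when the operation is an UPDATE, i.e. it writes a new row version."""
    sql, _params = hstore_statement(op, "task-id", "payload")
    return sql.startswith("UPDATE")
```

Under this layout every `set`, `delete`, and `incr` is an `UPDATE` of the same row, so each one writes a new copy of the row with the entire hstore value, and concurrent writers queue up behind that row's lock.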
All hstore is is a hash map in a database field. It's very useful, but it's not magic, and it won't free you from the underlying challenges of using an RDBMS for message queuing.
If you want a message queue, use a message queue. PGQ is one good option. Alternatively, check out dedicated message queue tools like ZeroMQ.