I am running apache-airflow 1.8.1 on AWS ECS, with an AWS ElastiCache cluster (Redis 3.2.4) running 2 shards / 2 nodes with Multi-AZ enabled (clustered Redis engine). I've verified that Airflow can reach the cluster's host and port without any problem.
Here are the logs:
Thu Jul 20 01:39:21 UTC 2017 - Checking for redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379) connectivity
Thu Jul 20 01:39:21 UTC 2017 - Connected to redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379)
logging to s3://xxxx-xxxx-xxxx/logs/airflow
Starting worker
[2017-07-20 01:39:44,020] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-07-20 01:39:45,960] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-07-20 01:39:45,989] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
[2017-07-20 01:39:53,352] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-07-20 01:39:55,187] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-07-20 01:39:55,210] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
[2017-07-20 01:53:09,536: ERROR/MainProcess] Unrecoverable error: ResponseError("CROSSSLOT Keys in request don't hash to the same slot",)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 374, in start
return self.obj.start()
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 278, in start
blueprint.start(self)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 569, in start
replies = I.hello(c.hostname, revoked._data) or {}
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 112, in hello
return self._request('hello', from_node=from_node, revoked=revoked)
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 71, in _request
timeout=self.timeout, reply=True,
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 307, in broadcast
limit, callback, channel=channel,
File "/usr/local/lib/python2.7/dist-packages/kombu/pidbox.py", line 294, in _broadcast
serializer=serializer)
File "/usr/local/lib/python2.7/dist-packages/kombu/pidbox.py", line 259, in _publish
maybe_declare(self.reply_queue(channel))
File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 120, in maybe_declare
return _maybe_declare(entity, declared, ident, channel)
File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 127, in _maybe_declare
entity.declare()
File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 522, in declare
self.queue_declare(nowait, passive=False)
File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 548, in queue_declare
nowait=nowait)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/virtual/__init__.py", line 447, in queue_declare
return queue_declare_ok_t(queue, self._size(queue), 0)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 690, in _size
sizes = pipe.execute()
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2626, in execute
return execute(conn, stack, raise_on_error)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2518, in _execute_transaction
response = self.parse_response(connection, '_')
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2584, in parse_response
self, connection, command_name, **options)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 585, in parse_response
response = connection.read_response()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 582, in read_response
raise response
ResponseError: CROSSSLOT Keys in request don't hash to the same slot
I had the exact same issue and solved it by not using a clustered setup with ElastiCache. Celery workers may simply not support clustered Redis, although I was unable to find anything that states this definitively.
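For background on the error itself: Redis Cluster assigns every key to one of 16384 hash slots (CRC16 of the key, modulo 16384), and a multi-key transaction is rejected with CROSSSLOT unless all of its keys land in the same slot. The traceback above ends in kombu's `_size` executing a pipeline, which is consistent with that. The following is a minimal Python 3 sketch of the slot calculation, not anything taken from Celery or kombu; the names `key_slot` and `same_slot` and the example key names are assumptions made for illustration.

```python
import binascii

NUM_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for a key.

    If the key contains a non-empty "{...}" hash tag, only the text
    between the first "{" and the next "}" is hashed.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with initial value 0 is the CRC16 (XMODEM)
    # variant that Redis Cluster uses.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


def same_slot(*keys: bytes) -> bool:
    """True if every key maps to the same hash slot."""
    return len({key_slot(k) for k in keys}) == 1


if __name__ == "__main__":
    # Hypothetical queue key names, for illustration only.
    keys = [b"celery", b"celery\x06\x163"]
    for k in keys:
        print(repr(k), key_slot(k))
    print("same slot:", same_slot(*keys))
    # Keys sharing a hash tag always map to the same slot.
    print("tagged:", same_slot(b"{celery}a", b"{celery}b"))
```

Unrelated key names will usually map to different slots, so a transaction touching several of them only works against a non-clustered Redis, or when the keys share a hash tag.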