Tags: python, kubernetes, dask, dask-distributed, dask-kubernetes

Error trying to use Dask on Kubernetes with distributed workers


I'm attempting to deploy a Dask application on Kubernetes in Azure. I have a Flask application server that acts as the client of a Dask scheduler and its workers.

I installed the Dask Kubernetes operator with Helm:

helm install --repo https://helm.dask.org --create-namespace -n dask-operator --generate-name dask-kubernetes-operator

This created the scheduler and worker pods, and they are running on Kubernetes without errors.

For the Flask application, I have a Docker image with the following Dockerfile:

FROM daskdev/dask

RUN apt-get -y install python3-pip

RUN pip3 install flask 
RUN pip3 install gunicorn 
RUN pip3 install "dask[complete]"
RUN pip3 install "dask[distributed]" --upgrade
RUN pip3 install "dask-ml[complete]"

Whenever I try to run a function on the workers through the Client interface, I get this error in the scheduler pod:

TypeError: update_graph() got an unexpected keyword argument 'graph_header'

It seems that the Dask image used to run Flask and the Dask Kubernetes operator I installed are running incompatible versions.

How can I build an image for the Flask server whose Dask version is compatible with the Dask Kubernetes deployment?

Running client.get_versions(check=True) from the Flask application returns this:

{'scheduler': {'host': {'python': '3.8.15.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '5.4.0-1105-azure', 'machine': 'x86_64', 'processor': 'x86_64', 'byteorder': 'little', 'LC_ALL': 'C.UTF-8', 'LANG': 'C.UTF-8'}, 'packages': {'python': '3.8.15.final.0', 'dask': '2023.1.0', 'distributed': '2023.1.0', 'msgpack': '1.0.4', 'cloudpickle': '2.2.0', 'tornado': '6.2', 'toolz': '0.12.0', 'numpy': '1.24.1', 'pandas': '1.5.2', 'lz4': '4.2.0'}}, 'workers': {'tcp://10.244.0.3:40749': {'host': {'python': '3.8.15.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '5.4.0-1105-azure', 'machine': 'x86_64', 'processor': 'x86_64', 'byteorder': 'little', 'LC_ALL': 'C.UTF-8', 'LANG': 'C.UTF-8'}, 'packages': {'python': '3.8.15.final.0', 'dask': '2023.1.0', 'distributed': '2023.1.0', 'msgpack': '1.0.4', 'cloudpickle': '2.2.0', 'tornado': '6.2', 'toolz': '0.12.0', 'numpy': '1.24.1', 'pandas': '1.5.2', 'lz4': '4.2.0'}}, 'tcp://10.244.0.4:36757': {'host': {'python': '3.8.15.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '5.4.0-1105-azure', 'machine': 'x86_64', 'processor': 'x86_64', 'byteorder': 'little', 'LC_ALL': 'C.UTF-8', 'LANG': 'C.UTF-8'}, 'packages': {'python': '3.8.15.final.0', 'dask': '2023.1.0', 'distributed': '2023.1.0', 'msgpack': '1.0.4', 'cloudpickle': '2.2.0', 'tornado': '6.2', 'toolz': '0.12.0', 'numpy': '1.24.1', 'pandas': '1.5.2', 'lz4': '4.2.0'}}, 'tcp://10.244.1.7:40561': {'host': {'python': '3.8.15.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '5.4.0-1105-azure', 'machine': 'x86_64', 'processor': 'x86_64', 'byteorder': 'little', 'LC_ALL': 'C.UTF-8', 'LANG': 'C.UTF-8'}, 'packages': {'python': '3.8.15.final.0', 'dask': '2023.1.0', 'distributed': '2023.1.0', 'msgpack': '1.0.4', 'cloudpickle': '2.2.0', 'tornado': '6.2', 'toolz': '0.12.0', 'numpy': '1.24.1', 'pandas': '1.5.2', 'lz4': '4.2.0'}}}, 'client': {'host': {'python': '3.8.16.final.0', 'python-bits': 64, 'OS': 'Linux', 'OS-release': '5.4.0-1105-azure', 'machine': 'x86_64', 
'processor': 'x86_64', 'byteorder': 'little', 'LC_ALL': 'C.UTF-8', 'LANG': 'C.UTF-8'}, 'packages': {'python': '3.8.16.final.0', 'dask': '2023.4.0', 'distributed': '2023.4.0', 'msgpack': '1.0.5', 'cloudpickle': '2.2.1', 'tornado': '6.2', 'toolz': '0.12.0', 'numpy': '1.23.5', 'pandas': '2.0.0', 'lz4': '4.3.2'}}} @ 2023-04-20 13:33:09.921545"}
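
The report above is hard to read by eye, so here is a minimal sketch of a helper that lists the packages whose versions differ between the client and the scheduler. The function name find_version_mismatches and the default package list are hypothetical, not part of Dask. It only assumes the dictionary layout shown in the output above ('client' and 'scheduler' entries, each with a 'packages' mapping).

```python
def find_version_mismatches(versions, packages=("python", "dask", "distributed")):
    """Return {package: {"client": v, "scheduler": v}} for versions that differ.

    `versions` is assumed to have the same layout as the get_versions()
    output pasted above. This helper is illustrative, not a Dask API.
    """
    client_pkgs = versions["client"]["packages"]
    scheduler_pkgs = versions["scheduler"]["packages"]
    mismatches = {}
    for name in packages:
        client_v = client_pkgs.get(name)
        scheduler_v = scheduler_pkgs.get(name)
        if client_v != scheduler_v:
            mismatches[name] = {"client": client_v, "scheduler": scheduler_v}
    return mismatches


# Trimmed copy of the values from the output above.
versions = {
    "scheduler": {"packages": {"python": "3.8.15.final.0",
                               "dask": "2023.1.0",
                               "distributed": "2023.1.0"}},
    "client": {"packages": {"python": "3.8.16.final.0",
                            "dask": "2023.4.0",
                            "distributed": "2023.4.0"}},
}

for name, found in find_version_mismatches(versions).items():
    print(f"{name}: client={found['client']} scheduler={found['scheduler']}")
```

With the values from this question, the helper reports dask and distributed at 2023.4.0 on the client against 2023.1.0 on the scheduler, plus a patch-level difference in Python.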


Solution

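  • Solved: I pinned Dask in the Dockerfile to version 2023.1.0, the version that the scheduler and workers created by the operator are running. The unpinned pip installs had pulled 2023.4.0 into the Flask image, so the client no longer matched the scheduler.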
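
To avoid hard-coding the version by hand, the pins can be derived from what the scheduler reports. This is a rough sketch: pins_from_scheduler is a hypothetical helper, and it assumes the same dictionary layout that client.get_versions() returned in the question. The `==` form is the standard pip exact-version specifier.

```python
def pins_from_scheduler(versions, packages=("dask", "distributed")):
    """Build pip requirement pins from the scheduler's reported versions.

    `versions` is assumed to look like the get_versions() output in the
    question. This helper is illustrative, not a Dask API.
    """
    scheduler_pkgs = versions["scheduler"]["packages"]
    return [f"{name}=={scheduler_pkgs[name]}" for name in packages]


# Scheduler versions taken from the question.
scheduler_report = {
    "scheduler": {"packages": {"dask": "2023.1.0", "distributed": "2023.1.0"}}
}

pins = pins_from_scheduler(scheduler_report)
# Print a line that could replace the unpinned installs in the Dockerfile.
print("RUN pip3 install " + " ".join(f'"{pin}"' for pin in pins))
```

For the versions in this question, the printed line is RUN pip3 install "dask==2023.1.0" "distributed==2023.1.0". Other packages in the image, such as dask-ml, may need matching pins too, which this sketch does not check.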