
ImportError in dask distributed clients


We have been using Dask distributed on a compute cluster for the past few months. Recently we upgraded all our Python packages, and now all the Dask clients fail with the following message:

distributed.nanny - INFO -         Start Nanny at: 'tcp://10.38.37.14:40983'
Traceback (most recent call last):
  File "PYTHON_INSTALL_DIR/lib/python3.6/multiprocessing/forkserver.py", line 178, in main
    _serve_one(s, listener, alive_r, handler)
  File "PYTHON_INSTALL_DIR/lib/python3.6/multiprocessing/forkserver.py", line 212, in _serve_one
    code = spawn._main(child_r)
  File "PYTHON_INSTALL_DIR/lib/python3.6/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'distributed.http'
distributed.nanny - WARNING - Worker process 8566 was killed by unknown signal
distributed.nanny - WARNING - Restarting worker

Any idea what could be causing this issue? FYI, I am getting this error with the following versions of the Dask-related packages:

  1. dask-0.18.1
  2. distributed-1.22.0

Thanks


Solution

  • distributed.http was indeed removed in recent versions. That you are getting this error suggests that you have an incompatible version lingering in your install, or that your paths have become mixed up somehow. You could check things like $PATH, which python, which dask-worker, python -c 'import sys; print(sys.path)', and so on.

    I would recommend installing into a fresh virtual or conda env (personally I prefer conda).
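
The checks above can be gathered into one small script. This is only a rough sketch: the function name describe_environment is made up for illustration, and it assumes the stale install, if there is one, shows up as an unexpected interpreter path, an unexpected dask-worker location, or an unexpected package location or version.

```python
import importlib
import importlib.util
import shutil
import sys


def describe_environment(packages=("dask", "distributed")):
    """Report which interpreter is running and where each package loads from.

    `describe_environment` is a hypothetical helper, not part of dask.
    """
    report = {
        "executable": sys.executable,                 # like `which python`
        "dask_worker": shutil.which("dask-worker"),   # like `which dask-worker`
        "sys_path": list(sys.path),
        "packages": {},
    }
    for name in packages:
        spec = importlib.util.find_spec(name)
        if spec is None:
            # Package is not importable from this interpreter at all.
            report["packages"][name] = None
            continue
        try:
            module = importlib.import_module(name)
            version = getattr(module, "__version__", "unknown")
        except Exception as exc:
            # A broken or half-upgraded install may fail on import.
            version = "import failed: {!r}".format(exc)
        report["packages"][name] = {"version": version, "location": spec.origin}
    return report


if __name__ == "__main__":
    info = describe_environment()
    print("python      :", info["executable"])
    print("dask-worker :", info["dask_worker"])
    for name, details in info["packages"].items():
        print(name, "->", details)
    print("sys.path:")
    for entry in info["sys_path"]:
        print("   ", entry)
```

Running this with the same command that launches the client, the scheduler, and a worker, then comparing the outputs, should show whether they resolve to different installs or versions.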