When I run this I expect to see a progress bar, but I don't.
from math import factorial

from dask.diagnostics import ProgressBar
from dask.distributed import Client


def dask_progress():
    client = Client()
    print(client)
    m = client.map(factorial, range(10))
    with ProgressBar():
        print(client.gather(m))


if __name__ == "__main__":
    dask_progress()
This is the output.
<Client: 'tcp://127.0.0.1:65468' processes=4 threads=8, memory=17.18 GB>
[1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
I see the same thing with this:
from math import factorial

from dask.distributed import Client, progress


def dask_progress():
    client = Client()
    print(client)
    m = client.map(factorial, range(10))
    progress(m)
    print(client.gather(m))


if __name__ == "__main__":
    dask_progress()
What do I need to do in order to see the progress bar?
ProgressBar (local) vs progress (distributed)
As hinted at in the first answer, your first code block won't show the progress bar because you are using the distributed scheduler. The ProgressBar is for use with the local scheduler. See https://docs.dask.org/en/latest/diagnostics-local.html
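For comparison, here is a minimal sketch of ProgressBar working with the local scheduler. It is not your code: the function name local_progress and the switch to dask.delayed with scheduler="threads" are my assumptions, chosen only to show the bar without a Client involved.

```python
from math import factorial

import dask
from dask.diagnostics import ProgressBar


def local_progress(n=10):
    # Hypothetical helper name. Build lazy tasks; nothing runs until
    # dask.compute is called.
    tasks = [dask.delayed(factorial)(i) for i in range(n)]
    # ProgressBar reports on the local scheduler, so no Client is created.
    with ProgressBar():
        results = dask.compute(*tasks, scheduler="threads")
    # dask.compute returns a tuple; convert it for easier printing.
    return list(results)


if __name__ == "__main__":
    print(local_progress())
```

With only ten tiny tasks the bar fills almost instantly, so raise n if you want to watch it move.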
Terminal vs Notebook
Your second code block does not reproduce the error for me when I run it as a Python script in my terminal. It's a very fast operation, so the progress bar only shows for a split second; but it definitely appears.
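If you want the bar to stay on screen long enough to watch it in a terminal, one option is to slow the tasks down. The sketch below is only an illustration: slow_factorial, the 0.2 second sleep, and Client(processes=False) are my assumptions, not something your code needs.

```python
import time
from math import factorial

from dask.distributed import Client, progress


def slow_factorial(n):
    # Hypothetical helper: the sleep only makes each task long enough to watch.
    time.sleep(0.2)
    return factorial(n)


def distributed_progress(n=20):
    # processes=False keeps the workers as threads in this process, which is
    # an assumption made to keep the sketch lightweight.
    with Client(processes=False) as client:
        futures = client.map(slow_factorial, range(n))
        # In a terminal, progress() blocks until the futures finish.
        progress(futures)
        return client.gather(futures)


if __name__ == "__main__":
    results = distributed_progress()
    print(results[:5])
```

Run as a script, the text progress bar should advance as the tasks complete, then the first few results are printed.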
However, when I run your second code block in a Jupyter notebook, the progress bar indeed doesn't appear. If you are working in a notebook you could do this:
m = client.map(factorial, range(1000))
progress(m)
Dask Docs
This section of the Dask docs might be relevant here: "In the notebook, the output of progress must be the last statement in the cell. Typically, this means calling progress at the end of a cell." I suspect that wrapping the progress call within an if __name__ == "__main__": block is causing issues behind the scenes here.