
Dask.distributed performance report does not work when the Client is not the default scheduler


I tried to create a performance report for a Client() backed by a LocalCluster. However, the performance report seems to work only when the Client() is registered as the default scheduler (set_as_default=True).

import dask.distributed as dd

cluster = dd.LocalCluster(n_workers=2, threads_per_worker=4, memory_limit='5GiB')
client = dd.Client(cluster, set_as_default=False)
    
with dd.performance_report(filename='dask-report.html'):
    print(client)
    print(cluster)

Am I missing something?

The code above raises the following error:

Traceback (most recent call last):
  File "bla/test.py", line 9, in <module>
    with dd.performance_report(filename='dask-report.html'):
  File "bla/lib/python3.10/site-packages/distributed/client.py", line 5497, in __enter__
    get_client().sync(self.__aenter__)
  File "bla/lib/python3.10/site-packages/distributed/worker.py", line 2771, in get_client
    raise ValueError("No global client found and no address provided")
ValueError: No global client found and no address provided

Is there a way to register the scheduler globally?

Thanks in advance!


Solution

  • After a Dask feature request was created based on this thread, Florian Jetter provided a currently working solution:

    If you don't want to set the client as default, you can also use Client.as_current, which sets the client as the active default client for as long as you are inside the context manager.

    import dask.distributed as dd
        
    cluster = dd.LocalCluster(n_workers=2, threads_per_worker=4, memory_limit='5GiB')
    client = dd.Client(cluster, set_as_default=False)
            
    with client.as_current():
        with dd.performance_report(filename='dask-report.html'):
            print(client)
            print(cluster)