
Dask - How to connect to running cluster scheduler and access 'total_occupancy'?


I use the following to create a local cluster from a Jupyter notebook:

from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=24)
c = Client(cluster)

Is it possible to connect to this cluster from another notebook while the first kernel is busy (running a compute operation)?

My goal is to read scheduler attributes such as 'total_occupancy'.


Solution

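  • As suggested by @moshevi, you can connect to the running scheduler from another notebook by passing its address to Client. For a LocalCluster, the address can be read from cluster.scheduler_address in the notebook that created it.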

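    from dask.distributed import Client

    # "address-of-scheduler" is a placeholder for the real scheduler address
    client = Client("address-of-scheduler")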
    

    Then you can use the client.run_on_scheduler method to run a function on the remote scheduler. If the function takes an argument named dask_scheduler, it receives the scheduler object itself:

    client.run_on_scheduler(lambda dask_scheduler: dask_scheduler.total_occupancy)
    

    https://docs.dask.org/en/latest/futures.html#distributed.Client.run_on_scheduler
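
Putting the two steps together, a minimal sketch might look like the following. The helper names get_total_occupancy and read_total_occupancy are hypothetical, and the address you pass in is whatever your own cluster reports; nothing here is specific to the asker's setup.

```python
def get_total_occupancy(dask_scheduler):
    """Runs inside the scheduler process.

    Client.run_on_scheduler fills the `dask_scheduler` argument with the
    live scheduler object, so plain attribute access is enough here.
    """
    return dask_scheduler.total_occupancy


def read_total_occupancy(scheduler_address):
    """Connect to an already running scheduler and return its total_occupancy.

    `scheduler_address` is an assumption: use the value reported by
    `cluster.scheduler_address` in the notebook that owns the LocalCluster.
    """
    # Imported lazily so the helper above can be exercised without a cluster.
    from dask.distributed import Client

    # Closing this client only drops the connection; the cluster itself
    # stays owned by the notebook that created it.
    with Client(scheduler_address) as client:
        return client.run_on_scheduler(get_total_occupancy)
```

In the second notebook you would then call read_total_occupancy with the address noted from the first one. Because the call goes straight to the scheduler, it should not have to wait for the first notebook's kernel to become free.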