
How do I tell Dagit (the Dagster GUI) to run on an existing Dask cluster?


I'm using dagster 0.11.3 (the latest as of this writing).

I've created a Dagster pipeline (saved as pipeline.py) that looks like this:

from dagster import ModeDefinition, pipeline, solid
from dagster_dask import dask_executor


@solid
def return_a(context):
    return 12.34


@pipeline(
    mode_defs=[
        ModeDefinition(
            executor_defs=[dask_executor]  # Note: dask only!
        )
    ]
)
def the_pipeline():
    return_a()

I have the DAGSTER_HOME environment variable set to a directory that contains an empty dagster.yaml file. This should be fine, since the defaults are reasonable according to these docs: https://docs.dagster.io/deployment/dagster-instance.

I have an existing Dask cluster running at "scheduler:8786". Based on these docs: https://docs.dagster.io/deployment/custom-infra/dask, I created a run config named config.yaml that looks like this:

execution:
  dask:
    config:
      cluster:
        existing:
          address: "scheduler:8786"

I have SUCCESSFULLY used this run config with Dagster like so:

$ dagster pipeline execute -f pipeline.py -c config.yaml

(I checked the Dask logs and made sure that it did indeed run on my Dask cluster)

My question is: How can I get Dagit to use this Dask cluster? The only thing I have found that seems related is this: https://docs.dagster.io/_apidocs/execution#executors

...but it doesn't even mention Dask as an option (it has dagster.in_process_executor and dagster.multiprocess_executor, which don't seem at all related to dask).

Probably I need to configure dagster-dask, which is documented here: https://docs.dagster.io/_apidocs/libraries/dagster-dask#dask-dagster-dask

...but where do I put that run config when using Dagit? There's no way to feed config.yaml to Dagit, for example.


Solution

  • Some options:

    Given the context, I would recommend the configured API.
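
    A minimal sketch of what that recommendation might look like, assuming that dask_executor.configured(...) in dagster 0.11.x accepts the same dictionary that sits under "config:" in the config.yaml from the question. The names DASK_EXECUTOR_CONFIG and build_pipeline are hypothetical and used only for illustration; they are not part of Dagster.

```python
# Executor config mirroring the "config:" section of config.yaml in the question.
# Assumption: the configured executor takes this same shape.
DASK_EXECUTOR_CONFIG = {"cluster": {"existing": {"address": "scheduler:8786"}}}


def build_pipeline():
    """Return the pipeline with the Dask cluster address baked into its executor."""
    # Imported inside the function so the config above can be inspected
    # even where dagster / dagster-dask are not installed.
    from dagster import ModeDefinition, pipeline, solid
    from dagster_dask import dask_executor

    @solid
    def return_a(context):
        return 12.34

    @pipeline(
        mode_defs=[
            ModeDefinition(
                # Pre-configured executor: the cluster address no longer has
                # to be supplied through run config.
                executor_defs=[dask_executor.configured(DASK_EXECUTOR_CONFIG)]
            )
        ]
    )
    def the_pipeline():
        return_a()

    return the_pipeline
```

    For Dagit to find the pipeline when started with something like "dagit -f pipeline.py", it would need to be exposed at module level, for example with the_pipeline = build_pipeline(). Whether Dagit then selects the Dask executor without any "execution:" entry in the run config is worth checking in the Dask scheduler logs, as was done for the CLI run.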