apache-spark, google-cloud-platform, google-cloud-dataproc, google-cloud-composer

Cloud Composer vs Cloud Dataproc Workflow Template


I need to run and orchestrate a few Spark jobs, some in parallel and some in series. What is the main difference between orchestrating them with Cloud Composer and with a Dataproc workflow template?


Solution

  • The two are very similar: in both cases the work is modeled as a DAG. The main differences are:

    • Dataproc Workflow Templates use a YAML workflow definition, can run only Dataproc jobs, and incur no additional cost.
    • Composer uses Python and operators and can run different types of jobs, but it requires a dedicated deployment (a Composer cluster) that incurs additional costs (about $400 per month for a small cluster).
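
To make the shared DAG idea concrete, here is a minimal sketch in plain Python. It uses neither the Dataproc API nor Airflow. It only shows how a set of steps with prerequisites resolves into groups that can run in parallel, followed by steps that must wait. The step names and the `execution_waves` helper are hypothetical and chosen for illustration. In a workflow template the same dependencies would be declared with each job's `prerequisiteStepIds`, and in a Composer DAG with task dependencies such as the `>>` operator.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical Spark steps: each key maps to the steps that must finish first.
prerequisites = {
    "ingest": [],
    "clean_a": ["ingest"],
    "clean_b": ["ingest"],
    "report": ["clean_a", "clean_b"],
}


def execution_waves(prereqs):
    """Group steps into waves; steps within one wave can run in parallel."""
    sorter = TopologicalSorter(prereqs)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(prerequisites), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Here `clean_a` and `clean_b` land in the same wave because both depend only on `ingest`, while `report` waits for both. Either tool performs this kind of scheduling for you. The choice between them comes down to the job types and cost points listed above.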