Tags: airflow, google-cloud-composer, dbt

dbt in Cloud Composer: general approach


My team is currently evaluating a migration from dbt Cloud to Google Cloud Composer. I have some (though not much) experience with workflow orchestration from a previous project, where we ran Spark jobs via Argo, but I am unsure whether a similar approach works with dbt (or is recommended). What I would like to know is:

  • Is it a valid/common approach to build a Docker image from our dbt project in GitHub and reference that image in the DAGs (see the sketch after this list)?
  • I figure that with this approach I also need to install/set up dbt Core inside the Docker container, correct?
  • Is this approach over-engineered, and are there more common/leaner ways to run dbt in Cloud Composer?
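
For reference, a minimal sketch of the DAG side of this approach, assuming Composer 2 with a recent cncf.kubernetes provider; the image name, schedule, and dbt target below are placeholders, not an actual setup:

```python
"""Minimal sketch: run a containerised dbt project from Cloud Composer."""
from datetime import datetime

from airflow import DAG
# The import path depends on the installed cncf.kubernetes provider version.
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="dbt_daily_run",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = KubernetesPodOperator(
        task_id="dbt_run",
        name="dbt-run",
        # Placeholder: image built from the dbt project repo in GitHub.
        image="europe-docker.pkg.dev/my-project/dbt/my-dbt-project:latest",
        cmds=["dbt"],
        arguments=["run", "--target", "prod"],
        get_logs=True,
    )
```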

Thanks a lot in advance for any hints!


Solution

  • Yes, you also need to install/set up dbt Core within the Docker container, per this doc. The same doc also says: "Using a prebuilt Docker image to install dbt Core in production has a few benefits: it already includes dbt-core, one or more database adapters, and pinned versions of all their dependencies." A sketch of what that can look like is below.
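
For illustration, a minimal Dockerfile sketch along those lines, assuming BigQuery as the warehouse and one of dbt's prebuilt adapter images; the version tag and file paths are placeholders to adapt to your project:

```dockerfile
# Minimal sketch, assuming BigQuery as the warehouse.
# The adapter image tag and the paths below are placeholders.
FROM ghcr.io/dbt-labs/dbt-bigquery:1.7.latest

# Bake the dbt project and a production profiles.yml into the image.
COPY . /usr/app/dbt/
COPY ci/profiles.yml /root/.dbt/profiles.yml

WORKDIR /usr/app/dbt
# The prebuilt dbt images already use "dbt" as the entrypoint, so the
# DAG only needs to pass arguments such as "run --target prod".
ENTRYPOINT ["dbt"]
```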