My new company use DBT and I'm struggling to find the answer to this question.
In my example, each model is set to materialise as a table.
If I have models that reference a previously run model and produce a DAG like this:
Query1 > Query2 > Query3 > Query 4.
Will this cause DBT to:
read from the materialised table of Query1 when it hits {{ref ('Query1')}}
in Query2?
OR.
run all the SQL from Query1 again, and keep it in memory, when it hits {{ref ('Query1')}}
in Query2?
The former. dbt will replace {{ ref('Query1') }}
with the fully-qualified name of the database relation that is materialized from the model whose name is Query1
.
You can see the SQL that dbt actually executes in your target
directory (inside your project directory), under target/compiled/path/to/my/model
and/or target/run/path/to/my/model
. If you do that for Query2, you'll see {{ ref('Query1') }}
has been replaced with something like "my_database"."my_schema"."Query1"
.
This is true only because you said that all of these models are materialized as tables. There are other materializations supported by dbt:
"my_database"."my_schema"."Query1"
when you use {{ ref('Query1') }}
. But under the hood, in order to execute that query, your database will be re-executing the SQL defined in Query1
, since that is how views workQuery1
as ephemeral
, and then {{ ref('Query1) }}
in Query2
, dbt will inject a CTE with the code from Query1
into the compiled Query2
query. That will look like:-- Query1 is materialized as ephemeral
with "__dbt_cte_Query1" as (
-- the code from the Query1 model file
)
-- your code from the Query2 model
select
...
from
-- in your Query2 file, you used {{ ref('Query1') }} here
"__dbt_cte_Query1"
...