I am new to DBT and am trying to build a Docker container in which I can run DBT commands directly. I have a file (envs.sh) where I export environment variables; it looks like:
export DB_HOST="secret"
export DB_PWD="evenabiggersecret"
My packages.yml looks like:
packages:
  - package: fishtown-analytics/dbt_utils
    version: 0.6.2
I structured my Dockerfile like:
FROM fishtownanalytics/dbt:0.19.0b1
# Define working directory
WORKDIR /usr/app/profile/
ENV DBT_DIR /usr/app
ENV DBT_PROFILES_DIR /usr/app
# Copy the dbt project into the image
COPY ./dbt ${DBT_DIR}
# Load env variables and install packages
COPY envs.sh envs.sh
RUN . ./envs.sh \
&& dbt deps # Exporting envs to avoid profile not found errors when install deps
However, when I run dbt run inside the Docker container, I get the error 'dbt_utils' is undefined. When I manually run dbt deps, it seems to fix the issue and dbt run succeeds. Am I missing something when I originally install the dependencies?
Update:
In other words, running dbt deps while building the Docker image seems to have no effect, so I have to run it manually (for example, when I do docker run) before I can start my workflows. This issue does not happen when I use a Python image instead of the image from fishtown-analytics.
Running dbt deps is a necessary step in preparing your dbt environment, so you should feel fine invoking dbt deps in the Dockerfile prior to dbt run.
I think, however, that your intention is getting lost in the RUN instruction on the last line. RUN executes while the image is being built, whereas CMD sets the command that executes when a container starts. Either that last-line RUN command should be converted to a CMD instruction, or you could perform a RUN dbt deps by itself beforehand. (See this question for more detail on the differences between RUN and CMD.)
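As a rough sketch of the CMD option, reusing the paths from the question: the end of the Dockerfile might look something like the following. Two things here are my assumptions rather than anything confirmed in the question: that the base image defines its own ENTRYPOINT of dbt (which is why the sketch replaces it with a shell), and that chaining dbt deps and dbt run in one command is what you want at container start.

```dockerfile
FROM fishtownanalytics/dbt:0.19.0b1

# Paths and variables copied from the question
WORKDIR /usr/app/profile/
ENV DBT_DIR /usr/app
ENV DBT_PROFILES_DIR /usr/app

# Copy the dbt project and the env file into the image
COPY ./dbt ${DBT_DIR}
COPY envs.sh envs.sh

# Assumption: the base image sets ENTRYPOINT ["dbt"], so it is replaced
# with a shell here; otherwise CMD would be passed to dbt as arguments.
ENTRYPOINT ["/bin/sh", "-c"]

# Runs at container start, not at build time:
# load the env vars, install packages, then run the models.
CMD [". ./envs.sh && dbt deps && dbt run"]
```

With this layout, dbt deps and then dbt run execute every time a container starts. The trade-off is that packages are downloaded on each run instead of being baked into the image.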
And, for what it's worth: dbt Cloud, the hosted SaaS build environment for dbt, also runs dbt deps as one of its standard steps for all dbt build jobs, meaning it executes at run time, every time, similar to Docker's CMD.