docker, google-cloud-run, dbt

Docker container runs locally but fails on Cloud Run to serve dbt docs


The idea is simple: dbt can generate static documentation files and serve them with the commands dbt docs generate and dbt docs serve, and I want to share them so that everyone in my organization can see them (setting security concerns aside for now). I thought Cloud Run would be an ideal solution for this, since I already have a Dockerfile and bash scripts that do some background work (a cron job that clones the git repo every x hours, etc.). Running the container locally works fine, but deploying the image to Cloud Run fails on the last step (dbt docs serve --port 8080) with the default error message: Cloud Run error: The user-provided container failed to start and listen on the port defined provided by the PORT=8080 environment variable. Logs for this revision might contain more information. Nothing else was printed in the logs before that.

Dockerfile:

FROM --platform=$build_for python:3.9.9-slim-bullseye 
WORKDIR /usr/src/dbtdocs
RUN apt-get update && apt-get install -y --no-install-recommends git apt-transport-https ca-certificates gnupg curl cron \
    && apt-get clean
RUN DEBIAN_FRONTEND=noninteractive apt-get -y install tzdata
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg  add - && apt-get update -y && apt-get install google-cloud-sdk -y
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir
RUN pip install dbt-bigquery
RUN ln -s /usr/local/bin/dbt /usr/bin/
RUN rm -rf /var/lib/apt/lists/*
COPY ./api-entrypoint.sh /usr/src/dbtdocs/
COPY ./cron_dbt_docs.sh /usr/src/dbtdocs/
COPY ./cron_script.sh /usr/src/dbtdocs/
ENV PORT=8080
RUN chmod 755 api-entrypoint.sh
RUN chmod 755 cron_dbt_docs.sh
RUN chmod 755 cron_script.sh
ENTRYPOINT ["/bin/bash", "-c", "/usr/src/dbtdocs/api-entrypoint.sh" ]

api-entrypoint.sh

#!/bin/bash

#set -e
#catch() {
#    echo 'catching!'
#    if [ "$1" != "0" ]; then
#    echo "Error $1 occurred on $2"
#    fi
#}
#trap 'catch $? $LINENO' EXIT
exec 2>&1
echo 'Starting DBT Workload'
echo 'Checking dependencies'

dbt --version
git --version

mkdir -p /data/dbt/ && cd /data/dbt/
echo 'Cloning dbt Repo'
git clone ${GITLINK} /data/dbt/

echo 'Working on dbt directory'
export DBT_PROFILES_DIR=/data/dbt/profile/

echo "Authenticate at GCP"
echo "Decrypting and saving sa.json file"
mkdir -p /usr/src/secret/
echo "${SA_SECRET}" | base64 --decode > /usr/src/secret/sa.json
gcloud auth activate-service-account ${SA_EMAIL} --key-file /usr/src/secret/sa.json
echo 'The Project set'
if test "${PROJECT_ID}"; then
    gcloud config set project ${PROJECT_ID}
    gcloud config set disable_prompts true
else
    echo "Project Name not in environment variables ${PROJECT_ID}"
fi
echo 'Use Google Cloud Secret Manager Secret'
if test "${PROFILE_SECRET_NAME}"; then
    #mkdir -p /root/.dbt/
    mkdir -p /root/secret/
    gcloud secrets versions access latest --secret="${PROFILE_SECRET_NAME}" > /root/secret/creds.json
    export GOOGLE_APPLICATION_CREDENTIALS=/root/secret/creds.json
else
    echo 'No Secret Name described - GCP Secret Manager'
fi

echo 'Apply cron Scheduler'
sh -c "/usr/src/dbtdocs/cron_script.sh install"
/etc/init.d/cron restart
touch /data/dbt_docs_job.log
sh -c "/usr/src/dbtdocs/cron_dbt_docs.sh"
touch /data/cron_up.log
tail -f /data/dbt_docs_job.log &
tail -f /data/cron_up.log &
dbt docs serve --port 8080

The container port is set to 8080 when creating the Cloud Run service, so I don't think that is the problem. Has anyone encountered a similar problem with Cloud Run?
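One way to check this locally, under conditions closer to Cloud Run, is to measure how long the container takes before anything accepts connections on the port. The sketch below assumes bash, since it relies on bash's /dev/tcp redirection. wait_for_port is a hypothetical helper name, not part of Docker or gcloud.

```shell
#!/bin/bash
# wait_for_port HOST PORT TIMEOUT_SECONDS
# Polls once per second until a TCP connection to HOST:PORT succeeds
# or TIMEOUT_SECONDS have elapsed. Hypothetical helper for local testing.
# Requires bash: /dev/tcp is a bash redirection feature, not a real file.
wait_for_port() {
    local host="$1" port="$2" timeout="$3"
    local start=$SECONDS
    while [ $((SECONDS - start)) -lt "$timeout" ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            echo "port ${port} open after $((SECONDS - start))s"
            return 0
        fi
        sleep 1
    done
    echo "port ${port} still closed after ${timeout}s" >&2
    return 1
}
```

For example, after starting the container locally with port 8080 published, running wait_for_port 127.0.0.1 8080 10 reports whether the server came up within 10 seconds.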

Logs in Cloud Logging


Solution

  • Your container is not listening on port 8080: it is terminated before the server process starts listening.

    Review the last line in the logs. The previous line is building catalog.

    Your container is taking too long to start up. Containers should start within 10 seconds because Cloud Run will only keep pending requests for 10 seconds.

    All of the work I see in the logs should be performed before the container is deployed and not during container start.

    The solution is to redesign how you are building and deploying this container so that the application begins responding to requests as soon as the container starts.
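One way to apply this is to move cloning, authentication, and dbt docs generate out of the entrypoint (into the image build or a separate job) and have the entrypoint only serve files that already exist. The following is a minimal sketch under those assumptions, not a tested deployment. DOCS_DIR, resolve_port, docs_ready, and serve_docs are hypothetical names. Python's built-in http.server stands in for dbt docs serve, on the assumption that the generated target directory (index.html, manifest.json, catalog.json) is a self-contained static site.

```shell
#!/bin/bash
# Fast-start entrypoint sketch: no git clone, gcloud auth, or
# `dbt docs generate` at container start.
# Assumption: the docs were generated earlier into DOCS_DIR (hypothetical path).
DOCS_DIR="${DOCS_DIR:-/data/dbt/target}"

# Cloud Run injects PORT; fall back to 8080 for local runs.
resolve_port() {
    echo "${PORT:-8080}"
}

# Succeeds only if the pre-generated static site is present in the directory.
docs_ready() {
    [ -f "$1/index.html" ]
}

# Serve the directory on all interfaces, or fail fast if the docs are missing.
serve_docs() {
    local dir="$1" port
    port="$(resolve_port)"
    if ! docs_ready "$dir"; then
        echo "No pre-generated docs found in ${dir}" >&2
        return 1
    fi
    # exec replaces the shell, so the server process receives signals directly.
    exec python3 -m http.server "$port" --bind 0.0.0.0 --directory "$dir"
}

# Start serving only when invoked as: <script> serve
if [ "${1:-}" = "serve" ]; then
    serve_docs "$DOCS_DIR"
fi
```

With an entrypoint like this, the port opens as soon as the Python process starts. The periodic refresh that the cron job performed would then have to happen outside the request-serving container, for example as a scheduled rebuild and redeploy of the image.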