Tags: postgresql, docker, gitpod

PostgreSQL and Docker: password authentication failed


I'm using Gitpod to build a container image with Docker (since I'm on Gitpod, I don't have access to the Docker command line). My goal is to install Python and PostgreSQL. This is my current Dockerfile:

# Base image is one of Python official distributions.
FROM python:3.8.13-slim-buster

# Update libraries and install sudo.
RUN apt update
RUN apt -y install sudo

# Install curl.
RUN sudo apt install -y curl

# Install git.
RUN sudo apt install install-info
RUN sudo apt install -y git-all

# Install nodejs.
RUN curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash -
RUN sudo apt install -y nodejs

# Download Google Cloud CLI installation script.
RUN mkdir -p /tmp/google-cloud-download
RUN curl -sSL https://sdk.cloud.google.com > /tmp/google-cloud-download/install.sh

# Install Google Cloud CLI.
RUN mkdir -p /gcloud
RUN bash /tmp/google-cloud-download/install.sh --install-dir=/gcloud --disable-prompts

# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . ./

# Install production dependencies.
RUN pip install --no-cache-dir -r requirements.txt

# Set some variables and create gitpod user.
ENV PGWORKSPACE="/workspace/.pgsql"
ENV PGDATA="$PGWORKSPACE/data"
RUN sudo mkdir -p $PGDATA
RUN useradd -l -u 33333 -G sudo -md /home/gitpod -s /bin/bash -p gitpod gitpod
RUN sudo chown gitpod $PGWORKSPACE -R

# Declare Django env variables.
ENV DJANGO_DEBUG=True
ENV DJANGO_DB_ENGINE=django.db.backends.postgresql_psycopg2

# Declare Postgres env variables. Note that these variables
# cannot be renamed since they are used by Postgres.
# https://www.postgresql.org/docs/current/libpq-envars.html
ENV PGDATABASE=postgres
ENV PGUSER=gitpod
ENV PGPASSWORD=gitpod
ENV PGHOST=localhost
ENV PGPORT=5432

# Install PostgreSQL 14. Note that this block needs to be located
# after the env variables are specified, since it uses POSTGRES_DB,
# POSTGRES_USER and POSTGRES_PASSWORD to create the first user.
RUN curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc|sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg
RUN sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
RUN sudo apt -y update
RUN sudo apt -y install postgresql-14

USER gitpod

# Set some more variables and init the db.
ENV PATH="/usr/lib/postgresql/14/bin:$PATH"
RUN mkdir -p ~/.pg_ctl/bin ~/.pg_ctl/sockets
RUN initdb -D $PGDATA
RUN printf '#!/bin/bash\npg_ctl -D $PGDATA -l ~/.pg_ctl/log -o "-k ~/.pg_ctl/sockets" start\n' > ~/.pg_ctl/bin/pg_start
RUN printf '#!/bin/bash\npg_ctl -D $PGDATA -l ~/.pg_ctl/log -o "-k ~/.pg_ctl/sockets" stop\n' > ~/.pg_ctl/bin/pg_stop
RUN chmod +x ~/.pg_ctl/bin/*
ENV PATH="$HOME/.pg_ctl/bin:$PATH"
ENV DATABASE_URL="postgresql://gitpod@localhost"
ENV PGHOSTADDR="127.0.0.1"

At this point, I would expect to have a database called postgres with a default user called gitpod, which is also the name of my default user in bash. However, when I run psql -h localhost I receive this error:

psql: error: connection to server at "127.0.0.1", port 5432 failed: FATAL:  password authentication failed for user "gitpod"
connection to server at "127.0.0.1", port 5432 failed: FATAL:  password authentication failed for user "gitpod"

Despite the message, I believe no user was actually created: I receive the same error if I try to log in with a random string as the username.

I also tried:

echo "host all all 127.0.0.1/32 trust" | sudo tee -a /etc/postgresql/14/main/pg_hba.conf

It doesn't change anything.

echo "host all all localhost trust" | sudo tee -a /etc/postgresql/14/main/pg_hba.conf

It doesn't change anything.

sudo -u gitpod psql postgres

It returns the error:

psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL:  role "gitpod" does not exist


Solution

  • If I create a Gitpod workspace that uses your Dockerfile to build a custom image (by starting from https://gitpod.io#https://github.com/larsks/so-example-73710360-gitpod-test/tree/main), I find that Gitpod mounts your workspace over /workspace at runtime. This means anything your Dockerfile places there (such as the Postgres data directory) isn't available when the workspace starts.

    If in the terminal I run:

    $ mkdir /workspace/data
    $ initdb
    $ ~/.pg_ctl/bin/pg_start
    

    Then postgres runs correctly, and I can connect to it using psql without problems:

    $ psql
    psql (14.5 (Debian 14.5-1.pgdg100+1))
    Type "help" for help.
    
    postgres=#
    

    We can fix this in the Dockerfile by installing the database outside of /workspace. For example, we can:

    • Use /data/pgdata for the database
    • Use /data/sockets for the sockets directory
    • Use /usr/local/bin for our pg_start/pg_stop scripts

    I've made a Dockerfile with these changes (and a few others); you can try it out at https://gitpod.io#https://github.com/larsks/so-example-73710360-gitpod-test/tree/fixed.
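    Concretely, the relocated layout could look something like this in the Dockerfile (a sketch only: the /data paths follow the list above, but the exact script contents and ordering are assumptions, not a copy of the fixed branch):

    ```dockerfile
    # Keep database state outside /workspace, which Gitpod replaces at runtime.
    ENV PGDATA="/data/pgdata"
    RUN mkdir -p /data/pgdata /data/sockets && chown -R gitpod /data

    # Put the helper scripts on the default PATH instead of under $HOME.
    RUN printf '#!/bin/bash\npg_ctl -l /data/log -o "-k /data/sockets" start\n' > /usr/local/bin/pg_start && \
        printf '#!/bin/bash\npg_ctl -l /data/log -o "-k /data/sockets" stop\n' > /usr/local/bin/pg_stop && \
        chmod +x /usr/local/bin/pg_start /usr/local/bin/pg_stop

    USER gitpod

    # Initialize the cluster at build time; unlike /workspace,
    # /data survives into the running workspace.
    RUN initdb
    ```

    Since pg_ctl and initdb both read PGDATA from the environment, the scripts no longer need a -D flag.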

    With these changes, pg_start runs without a problem, and psql connects successfully:

    gitpod@larsks-soexample7371036-htkkrp9bsl7:/workspace/so-example-73710360-gitpod-test$ pg_start
    waiting for server to start.... done
    server started
    gitpod@larsks-soexample7371036-htkkrp9bsl7:/workspace/so-example-73710360-gitpod-test$ psql
    psql (14.5 (Debian 14.5-1.pgdg100+1))
    Type "help" for help.
    
    postgres=#
    

    I've made a few other changes to the Dockerfile because I couldn't help myself. In particular:

    • I've removed the unnecessary use of sudo throughout the file

    • We can speed up the build process by passing multiple package names to apt install, rather than installing each one individually.

    • I've improved local image build times by re-arranging things to be more cache efficient.

      In particular, when you COPY . ./, any change to a file in your working directory will invalidate the cache, so any build steps after that need to be re-executed. By doing as much as possible before that COPY statement we can significantly decrease the time it takes to rebuild the image locally.
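      As a sketch, the separate apt layers from the original Dockerfile could collapse into a single one like this (assuming the NodeSource and PGDG repositories were configured in an earlier layer; the package names are taken from the original file):

      ```dockerfile
      # One layer, one cache entry: all packages install together,
      # and the apt lists are removed to keep the image small.
      RUN apt-get update && \
          apt-get install -y curl git-all nodejs postgresql-14 && \
          rm -rf /var/lib/apt/lists/*
      ```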