I have a few services that run Python 3.7 with Flask and need only a few extra libraries. One of them is psycopg2, which they use to connect to PostgreSQL.
Installing psycopg2 on Alpine is not difficult in itself, but I had trouble finding documentation on it. I ended up with a Dockerfile that works; its biggest downside is the image size, about 355MB, which is just too heavy.
This is my initial Dockerfile, before any optimization:
FROM python:3.7-alpine
ENV PATH /usr/local/bin:$PATH
ENV LANG C.UTF-8
RUN mkdir -p /usr/src/app
COPY requirements.txt /usr/src/app/
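# Build tools go into a virtual package (temp1) so they can be removed later
RUN apk update \
    && apk add postgresql-dev \
    && apk add --virtual temp1 gcc python3-dev musl-dev \
    && pip install --upgrade pip \
    && pip install psycopg2==2.8.4
RUN pip install -r /usr/src/app/requirements.txt
# Runs in its own layer, so the space temp1 took in the layer above is not reclaimed
RUN apk del temp1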
COPY . /usr/src/app
WORKDIR /usr/src/app
EXPOSE 6000
ENTRYPOINT ["python3"]
CMD ["-m", "server"]
And my requirements.txt:
psycopg2 == 2.8.4
connexion == 1.1.15
python_dateutil == 2.6.0
loguru~=0.4.1
flask~=1.1.2
six~=1.14.0
Werkzeug==0.16.1
pymongo
PyYAML == 5.3
setuptools == 45.1.0
flask_testing == 0.7.1
mo-future>=3
pyparsing==2.3.1
mo_files
pycryptodomex
ldap3
Doing some testing, I found that the steps that increase the image size the most are:
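One way to take this kind of measurement is `docker history`, which prints the size each build step added to the image. A minimal sketch, assuming the image was built with the hypothetical tag `myservice:before` (the tag and the `layer_sizes` helper name are placeholders, not part of the setup above):

```shell
# layer_sizes IMAGE
# Print one line per layer: the size it added, then the instruction that created it.
# "myservice:before" is a placeholder tag; pass your own image name instead.
layer_sizes() {
    image="${1:-myservice:before}"
    docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' "$image"
}
```

For example, after `docker build -t myservice:before .`, running `layer_sizes myservice:before` lists the layers, newest first.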
Things I tried to do to reduce its size:
I'm going to try to answer these questions:
The first thing I wanted to do was remove postgresql-dev from the container while still being able to use psycopg2. The only file that seems to be missing without it is libpq.so.5, which is provided by the Alpine package libpq.
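So postgresql-dev only has to be present while psycopg2 is being built. Installing it as a build dependency, deleting it in the same RUN step, and keeping only libpq installed lets us build psycopg2 and still save practically all the space postgresql-dev used before.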
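To check that psycopg2 still loads once postgresql-dev is gone, the import can be run inside the finished image. A rough sketch, assuming a hypothetical image tag `myservice:after` and a helper name chosen only for illustration:

```shell
# check_psycopg2 IMAGE
# Import psycopg2 inside the image and print its version.
# "myservice:after" is a placeholder tag; pass your own image name instead.
check_psycopg2() {
    image="${1:-myservice:after}"
    # Override the entrypoint explicitly so the check does not depend on the image's CMD.
    docker run --rm --entrypoint python3 "$image" \
        -c 'import psycopg2; print(psycopg2.__version__)'
}
```

If libpq.so.5 is missing from the image, the import fails with an ImportError instead of printing a version.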
I tried to minimize the number of steps in the Dockerfile so the final image is lighter. With the appropriate flags (--no-cache for apk, --no-cache-dir for pip) we can avoid storing package caches in the image. Also, declaring a variable that groups all the build dependencies keeps things cleaner.
I also wrote a more careful .dockerignore to save even more space. Tools like tree can help you find files in your container that aren't necessary.
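As an illustration, a .dockerignore for a Python service might look like the one written below. The entries are assumptions about a typical project layout, not the actual file used here, so adjust them to whatever `tree -a` shows in your build context.

```shell
# Write an example .dockerignore into the current directory.
# Warning: this overwrites an existing .dockerignore.
cat > .dockerignore <<'EOF'
# Version control and editor metadata
.git
.gitignore
.idea
.vscode

# Python bytecode and local virtual environments
**/__pycache__
**/*.pyc
.venv

# Build files that the running service does not need
Dockerfile
.dockerignore
EOF
```

Because the Dockerfile ends with `COPY . /usr/src/app`, everything not matched by these patterns still ends up in the image.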
Based on this fine article, I was able to run the container as an unprivileged user that cannot modify the container's files.
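A quick way to confirm this is to ask the container who it runs as and then try to write next to the application code. A hedged sketch, assuming a hypothetical image tag `myservice:after`, the /usr/src/app path from the Dockerfile, and an illustrative helper name:

```shell
# check_user IMAGE
# Show the user the container runs as, then verify it cannot write to /usr/src/app.
# "myservice:after" is a placeholder tag; pass your own image name instead.
check_user() {
    image="${1:-myservice:after}"
    # Show the uid/gid the container runs as.
    docker run --rm --entrypoint id "$image"
    # Try to create a file next to the application code; this is expected to fail.
    if docker run --rm --entrypoint touch "$image" /usr/src/app/write-test 2>/dev/null; then
        echo "WARNING: container user can write to /usr/src/app"
    else
        echo "OK: container user cannot write to /usr/src/app"
    fi
}
```

The write is expected to fail because the files under /usr/src/app are copied in as root before the `USER` instruction switches to the unprivileged account.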
This is the Dockerfile I ended up with. The image went down from 355MB to 135MB, which isn't exactly perfect, but is a lot better.
FROM python:3.7-alpine
ENV PATH /usr/local/bin:$PATH
ENV LANG C.UTF-8
ENV USER=prodUser UID=12345 GID=23456
RUN mkdir -p /usr/src/app
COPY requirements.txt /usr/src/app/
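# libpq stays in the image (runtime dependency of psycopg2);
# the build dependencies are installed and removed within the same layer
RUN buildDeps='gcc python3-dev musl-dev postgresql-dev' \
    && apk update \
    && apk add --no-cache libpq \
    && apk add --virtual temp1 --no-cache $buildDeps \
    && pip install --no-cache-dir -r /usr/src/app/requirements.txt \
    && apk del temp1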
COPY . /usr/src/app
WORKDIR /usr/src/app
# Create an unprivileged user and group to run the service
RUN addgroup --gid "$GID" "$USER" \
    && adduser \
        --disabled-password \
        --gecos "" \
        --ingroup "$USER" \
        --uid "$UID" \
        "$USER"
USER $USER
EXPOSE 6000
ENTRYPOINT ["python3"]
CMD ["-m", "server"]
I'm still new to working with Docker, so any advice or suggested changes are welcome!