postgresql docker dockerfile docker-volume

Postgres Dockerfile exploration - VOLUME statement usage

I am looking at sample Dockerfiles to see how VOLUME is used, and I came across the following lines in https://github.com/docker-library/postgres/blob/master/Dockerfile-alpine.template

ENV PGDATA /var/lib/postgresql/data
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA"
VOLUME /var/lib/postgresql/data

What is the purpose of using a volume here? Here is my understanding, please confirm:

Create the directory pointed to by $PGDATA in the image file system.
Declare it as a VOLUME so that any content created later, when docker-entrypoint.sh populates the database, ends up in a predefined directory that the container can use.

What if the VOLUME instruction is not defined? It might be more laborious for someone to figure out where to keep custom changes if VOLUME is not defined.

Solution

A VOLUME is defined here, so when you start a container from this image, a new anonymous volume is created.

The volume holds the data that matters here (the database files), so it is all you need to "persist" across the normal container lifecycle.

Usually, when the maintainers of a Docker image already know where the data worth keeping is located (like here), they mark that folder with VOLUME in the Dockerfile. As mentioned, this creates an anonymous volume at runtime, but it also makes you aware (via docker inspect or by reading the Dockerfile) of where the volumes for persistence are located.

In production you will usually use a named volume or a path mount in your docker-compose file, mounted to this very folder.

docker-compose.yml as named volume

# under the service; mydbdata must also be declared under the top-level volumes: key
volumes:
  - mydbdata:/var/lib/postgresql/data

docker-compose.yml as path

# under the service; the host path is relative to the compose file
volumes:
  - ./local/path/data:/var/lib/postgresql/data
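
For context, the named-volume variant could sit in a complete file roughly like the sketch below. The service name db, the image tag, the placeholder password, and the volume name mydbdata are assumptions for illustration, not something taken from the question.

```yaml
# Minimal sketch; service name, image tag, password and volume name are illustrative.
services:
  db:
    image: postgres:16-alpine
    environment:
      # the official image expects a superuser password when it
      # initializes an empty data directory; use a real secret here
      POSTGRES_PASSWORD: example
    volumes:
      # named volume mounted over the path the image declares as VOLUME
      - mydbdata:/var/lib/postgresql/data

volumes:
  # top-level declaration of the named volume used above
  mydbdata:
```

Because the volume has a stable name, a container recreated later is attached to the same data again.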

There are actually downsides to declaring a VOLUME in the Dockerfile, which I will not elaborate on here, but the main reason for doing it is "lifetime".

Suppose there is no VOLUME in the Dockerfile (and no volume mounted in your compose file) and you run

docker-compose up -d
# do something, manipulate the data
docker-compose down

# all your data would be lost when starting again
docker-compose up -d

This would remove not only the running container but also all your DB data, because the data lives in the container's own writable layer. That might not be what you intended (you just wanted to recreate the container).

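With VOLUME in the Dockerfile, the anonymous volume survives docker-compose down (unless you pass -v). Note, however, that a container created by a later docker-compose up gets a fresh anonymous volume, so the old one is kept but has to be reattached manually. That is why a named volume or path mount is the better choice for data you want to keep.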