docker kubernetes docker-compose kubernetes-statefulset

What is the purpose of mounting volumes that contain already mounted volumes?

I am looking at a docker-compose file for the Eramba project, the volumes part looks like this:

    volumes:
      - data:/var/www/eramba/app/upgrade/data
      - app:/var/www/eramba
      - logs:/var/www/eramba/app/upgrade/logs

Why do they mount data and logs when app, which is their parent directory, is also mounted? To me this looks redundant.

I am trying to translate this to a Kubernetes config, where these paths should be mounted as volumes through volumeClaimTemplates. I am not sure whether I can just mount /var/www/eramba and be done with it, or whether I also have to mount the other two.

Solution

Internally, Compose sorts the volumes by mount point before it mounts them. This causes the "outer" volume to be mounted first, with the "inner" volumes then mounted on top of directories inside it. So the filesystem layout will look roughly like this:

    /                               image
    +-- var/                        image
        +-- www/                    image
            +-- eramba/             app named volume
                +-- app/            app named volume
                    +-- upgrade/    app named volume
                        +-- data/   data named volume
                        +-- logs/   logs named volume

I'm guessing that the volume mount over the entire /var/www/eramba directory tree is a mistake, and you should delete it (even in your existing Compose file). Docker copies image content into a named volume when the volume is first mounted, and there are setups that try to rely on this. However, this only works for Docker named volumes (it doesn't happen in Kubernetes), and it only happens if the volume is completely empty, so the volume keeps its old content and ignores upgrades in the original image. I'd be concerned that replicating this volume mount in Kubernetes would cause the actual application to be lost, hidden behind the empty contents of an uninitialized persistent volume.
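As a rough sketch of what that could look like in Kubernetes, here is a StatefulSet that mounts only the data and logs directories and leaves /var/www/eramba to come from the image. The two mount paths are from your Compose file; everything else (the object names, the image reference, the Service name, and the storage sizes) is a placeholder I made up, so adjust it to your environment.

```yaml
# Sketch only: names, image, and sizes below are assumptions, not Eramba defaults.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: eramba
spec:
  serviceName: eramba            # assumes a headless Service with this name exists
  replicas: 1
  selector:
    matchLabels:
      app: eramba
  template:
    metadata:
      labels:
        app: eramba
    spec:
      containers:
        - name: eramba
          image: registry.example.com/eramba:latest   # hypothetical image reference
          volumeMounts:
            # Only the two leaf directories are mounted; the rest of
            # /var/www/eramba keeps coming from the image.
            - name: data
              mountPath: /var/www/eramba/app/upgrade/data
            - name: logs
              mountPath: /var/www/eramba/app/upgrade/logs
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi         # assumed size
    - metadata:
        name: logs
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi         # assumed size
```

Each entry in volumeClaimTemplates creates one PersistentVolumeClaim per replica, and the name in volumeMounts has to match the template's metadata.name.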

If it's possible to restructure or reconfigure the application, also consider changing it so that it writes logs to its stdout, rather than a directory. This will make logs accessible via docker logs or kubectl logs, and you won't have to manually copy the log files out to read them.

Similarly, if it's possible to rearrange the application so that it stores all of the data in an external database (maybe in a separate StatefulSet) then you won't need any volume mounts in the container at all, and you can very easily use a Deployment's replicas: field to run multiple copies of it. This is often a larger change, though, and it may not be practical.
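If you do get to that point, the workload could shrink to something like the following Deployment. This is only an illustration: the image reference, the DB_HOST variable, and the eramba-db host name are hypothetical, and how the application is actually pointed at its database depends on the application itself.

```yaml
# Sketch only: assumes the application keeps no state on its local filesystem.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: eramba
spec:
  replicas: 3                    # several identical copies, no per-replica storage
  selector:
    matchLabels:
      app: eramba
  template:
    metadata:
      labels:
        app: eramba
    spec:
      containers:
        - name: eramba
          image: registry.example.com/eramba:latest   # hypothetical image reference
          env:
            - name: DB_HOST      # hypothetical variable name
              value: eramba-db   # hypothetical Service name of the external database
```

There is no volumes: or volumeMounts: section at all here, which is what makes scaling with replicas: straightforward.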