
Persistent volumes with docker-compose and RStudio/Jupyter


I am trying to create an RStudio instance and a Jupyter Notebook instance with docker-compose, building an image from a Dockerfile. I can create these instances, but the mounted volumes do not seem to be persistent.

Below is the tree structure of my folders:

.
├── Docker
│   ├── docker-compose.yml
│   └── Dockerfile
└── R_and_Jupyter_scripts
    └── files_already_there.txt

I would like:

1) The files_already_there.txt file to be available in both my RStudio and Jupyter Notebook instances.

2) Any new file or script created from either instance to appear in the R_and_Jupyter_scripts folder with read/write permissions. For example, I created a text file "output.txt" from the RStudio instance, but when I search for it, this is where it was saved:

sudo find / -name "output.txt"
/var/lib/docker/overlay2/821d3d087948309e3c489af29a5263e53e5f72627e903b4285a9597214412840/diff/home/maxence/output.txt
/var/lib/docker/overlay2/821d3d087948309e3c489af29a5263e53e5f72627e903b4285a9597214412840/merged/home/maxence/output.txt

Below is my current docker-compose.yml. Can you identify what is wrong?

version: "3.5"
services:
  rstudio:
    environment:
      - USER=maxence
      - PASSWORD=password
    image: "rocker/tidyverse:latest"
    build:
     context: ./
     dockerfile: Dockerfile
    volumes:
      - /home/ec2-user/R_and_Jupyter_scripts:/var/lib/docker/
    container_name: rstudio
    ports:
     - 8787:8787

  jupyter:
    image: 'jupyter/datascience-notebook:latest'
    ports:
     - 8888:8888
    volumes:
      - /home/ec2-user/R_and_Jupyter_scripts:/var/lib/docker/
    container_name: jupyter

Thanks a lot.


Solution

  • I don't have your Dockerfile, and it looks like you're running on an Amazon instance, so I can't reproduce this exactly. However, if I delete the Dockerfile (build) part and make some other edits, I get something that works on my Ubuntu 18 machine.

    version: "3.5"
    services:
      rstudio:
        environment:
          - USER=maxence
          - PASSWORD=password
        image: "rocker/tidyverse:latest"
        volumes:
          - /tmp/R_and_Jupyter_scripts:/home/maxence/R_and_Jupyter_scripts
        container_name: rstudio
        ports:
          - 8787:8787
    
      jupyter:
        image: 'jupyter/datascience-notebook:latest'
        ports:
          - 8888:8888
        volumes:
          - /tmp/R_and_Jupyter_scripts:/home/jovyan/R_and_Jupyter_scripts
        working_dir: /home/jovyan/R_and_Jupyter_scripts
        container_name: jupyter
    

    I used /tmp/R_and_Jupyter_scripts as the location outside the containers.

    In RStudio, /home/maxence is created because it corresponds to the USER that is specified, and it is the location shown when you log in. I simply mounted the host folder as a directory below it in the volumes directive. When you log on to RStudio, you will see a folder called R_and_Jupyter_scripts that you can enter and create whatever you like in. (I thought working_dir would work here too, but RStudio always seems to start in the home folder.)

    In the Jupyter image you are using, the default user is jovyan, so a folder /home/jovyan is created automatically. I then use this in the volumes and working_dir directives. The Jupyter container will start in this working_dir when you log on.

    I tested it, and reading and writing work in all 3 places.
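If files written from a container turn up on the host with an owner you cannot edit as, it may help to align the container users with the host user. As far as I know, the rocker images read USERID and GROUPID, and the Jupyter Docker Stacks images read NB_UID and NB_GID when the container is started as root. The fragment below is only a sketch to merge into the services above. The value 1000 is an assumption, so replace it with the output of `id -u` and `id -g` on your host.

```yaml
services:
  rstudio:
    environment:
      - USER=maxence
      - PASSWORD=password
      # Assumption: 1000 is the UID/GID of the host user that owns the shared folder
      - USERID=1000
      - GROUPID=1000

  jupyter:
    # The Jupyter start-up script only applies NB_UID/NB_GID when started as root
    user: root
    environment:
      - NB_UID=1000
      - NB_GID=1000
```

I did not need this on my machine, so treat it as something to try only if you hit permission errors.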

    Using /var/lib/docker as the mount point does create the volume correctly inside each container. The issue is that neither client has a way to set its working location to it.
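To apply the same idea to the folder layout in the question, the host side of each volume can point at the existing R_and_Jupyter_scripts folder instead of /tmp. This is a minimal sketch, assuming docker-compose.yml stays in the Docker folder and that you run docker-compose from there. Compose resolves relative host paths from the directory containing the compose file, so `../R_and_Jupyter_scripts` is the sibling folder. I have not run this variant on an EC2 instance.

```yaml
# docker-compose.yml, assumed to live in ./Docker as in the question's tree
version: "3.5"
services:
  rstudio:
    image: "rocker/tidyverse:latest"
    environment:
      - USER=maxence
      - PASSWORD=password
    volumes:
      # host path (relative to this file) : path inside the container
      - ../R_and_Jupyter_scripts:/home/maxence/R_and_Jupyter_scripts
    container_name: rstudio
    ports:
      - "8787:8787"

  jupyter:
    image: 'jupyter/datascience-notebook:latest'
    volumes:
      - ../R_and_Jupyter_scripts:/home/jovyan/R_and_Jupyter_scripts
    working_dir: /home/jovyan/R_and_Jupyter_scripts
    container_name: jupyter
    ports:
      - "8888:8888"
```

The absolute path from the question, /home/ec2-user/R_and_Jupyter_scripts, should work equally well on the host side. The important change is the container side of each mapping, which now sits under a home directory that each client can actually browse to.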