
How to add a .gitignore to a Datalab notebook on Google Cloud Platform?


I am using ungit with Datalab notebooks on Google Cloud Platform. I would like to ignore the directories that hold saved model data. How can I do this? I don't see any UI option for it.


Solution

  • This guide has a section on how to SSH into the instance. To do that, the Compute Engine instance associated with the Datalab notebook must be running. You can start it from the console or simply connect to your Datalab notebook; either way starts the instance so that you can SSH into it.

    1. Run the ssh command listed in step one of the guide.

    2. Run the docker command listed in step two of the guide.

    3. Open an interactive shell in the container, as listed in step three of the guide.

    4. Change directory to the root of the notebooks git repo, as listed in step four of the guide.

    5. The container does not include an editor. I was able to install vim with apt-get install vim.

    Edit the .gitignore file. It should already have some entries. My code is in a top-level directory named mine and my models are saved in directories named model_trained, so I added model_trained to the .gitignore file. Without a leading or trailing slash, the pattern matches the model output directories wherever they appear in the repository.

    This is the resulting .gitignore:

    root@b28d8cf57173:~/datalab/notebooks# cat .gitignore 
    .ipynb_checkpoints
    *.pyc
    model_trained
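
If installing an editor is not convenient, the same entry could also be appended from the shell. The following is a minimal sketch rather than part of the original steps: it works in a throwaway directory (the `repo_dir` variable is a placeholder), and the first `printf` only stands in for the entries that already exist. Inside the container you would run just the `grep ... || printf ...` line from the root of the notebooks repo.

```shell
# Work in a scratch directory so nothing real is touched (placeholder for the repo root).
repo_dir="$(mktemp -d)"
cd "$repo_dir"

# Stand-in for the entries that were already in .gitignore.
printf '%s\n' '.ipynb_checkpoints' '*.pyc' > .gitignore

# Append the pattern only if that exact line is not present yet.
# grep -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string.
grep -qxF 'model_trained' .gitignore || printf '%s\n' 'model_trained' >> .gitignore

# Running the same command again leaves the file unchanged.
grep -qxF 'model_trained' .gitignore || printf '%s\n' 'model_trained' >> .gitignore

cat .gitignore
```

The `grep` guard keeps the file free of duplicate lines if the command is run more than once.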
    

    Afterwards, I trained the model and checked the result with ungit.
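
The pattern can also be checked from the command line with `git check-ignore`, independently of ungit. The sketch below is an illustration under assumptions: it builds a scratch repository, and the file names `train.py` and `weights.bin` and the `experiments` subdirectory are made up for the demonstration. It shows that a bare `model_trained` entry matches directories with that name at any depth.

```shell
# Build a scratch repository with the same .gitignore entries as above.
scratch="$(mktemp -d)"
cd "$scratch"
git init -q .
printf '%s\n' '.ipynb_checkpoints' '*.pyc' 'model_trained' > .gitignore

# Hypothetical layout: model output directories at two different depths.
mkdir -p mine/model_trained mine/experiments/model_trained
touch mine/train.py \
      mine/model_trained/weights.bin \
      mine/experiments/model_trained/weights.bin

# -v prints which .gitignore line matched each ignored path.
git check-ignore -v \
    mine/model_trained/weights.bin \
    mine/experiments/model_trained/weights.bin

# check-ignore exits non-zero when the path is not ignored.
if git check-ignore -q mine/train.py; then
    echo "mine/train.py is ignored"
else
    echo "mine/train.py is not ignored"
fi

# Ignored paths are left out of the status listing.
git status --short --untracked-files=all
```

In the status listing, files under either `model_trained` directory should be absent while `mine/train.py` is still reported as untracked.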