
How to install conda on Google Drive for Google Colab?


! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh
! chmod +x Miniconda3-py37_4.8.2-Linux-x86_64.sh
! bash ./Miniconda3-py37_4.8.2-Linux-x86_64.sh -b -f -p /usr/local
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')

This cell installs conda in my Google Colab session. Colab limits session length and resets the environment state and data after 8 or 9 hours of active computation, so I have to rerun this cell again and again.

Is there a way to install conda and all the packages I need on Google Drive?


Solution

  • This is not a perfect solution, but it may be faster than downloading and building a new conda installation every time. Overview of steps:

    1. Install conda into a local directory on Colab, pack that directory into a tarball, and store it on Google Drive.
    2. When you start a new Colab notebook or restart an existing one, run a code block that fetches the conda install from Google Drive and sets up the environment again.

    1. Create an installation of conda and the packages you need (once only)

    Download and install miniconda to the /content/miniconda3 directory:

    %env PYTHONPATH=
    ! wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.9.2-Linux-x86_64.sh
    ! chmod +x Miniconda3-py37_4.9.2-Linux-x86_64.sh
    ! bash ./Miniconda3-py37_4.9.2-Linux-x86_64.sh -b -f -p /content/miniconda3
    

    Add miniconda to the system PATH:

    import os
    path = '/content/miniconda3/bin:' + os.environ['PATH']
    %env PATH=$path
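
    Rerunning the cell above prepends the same directory to PATH each time. A minimal sketch of an idempotent alternative in plain Python follows; prepend_to_path is a hypothetical helper name, not part of Colab or conda, and the directory assumes the install location used above.

```python
import os

def prepend_to_path(directory, current_path):
    """Return current_path with directory placed first and listed only once."""
    entries = [p for p in current_path.split(os.pathsep) if p and p != directory]
    return os.pathsep.join([directory] + entries)

# Assumed install location, taken from the miniconda step above.
CONDA_BIN = "/content/miniconda3/bin"

# Safe to run repeatedly: the directory never appears twice.
os.environ["PATH"] = prepend_to_path(CONDA_BIN, os.environ.get("PATH", ""))
print(os.environ["PATH"].split(os.pathsep)[0])
```

    Setting os.environ should have the same effect as %env for later ! commands, since they inherit the kernel's environment.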
    

    Install the conda packages you need (e.g. packagexyz):

    !conda install -c conda-forge packagexyz -y
    

    Optional code block: check that packagexyz works correctly. This should print the version of packagexyz and its location within the conda directory:

    import sys
    _ = sys.path.append("/content/miniconda3/lib/python3.7/site-packages")
    import packagexyz
    print(packagexyz.__version__, packagexyz.__file__)
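
    The import check above runs in the notebook's own Python interpreter, not in conda's. Compiled packages built for Python 3.7 will generally not import under a different minor version, so it may be worth comparing versions first. A sketch: same_minor_version is a hypothetical helper, and (3, 7) is assumed from the py37 installer name.

```python
import sys

CONDA_PYTHON = (3, 7)  # assumed from the Miniconda3-py37 installer used above

def same_minor_version(kernel_version, conda_version=CONDA_PYTHON):
    """True when the major and minor version numbers match."""
    return tuple(kernel_version[:2]) == tuple(conda_version)

kernel = tuple(sys.version_info[:3])
if same_minor_version(kernel):
    print("Kernel Python matches conda:", kernel)
else:
    print("Kernel Python", kernel, "differs from conda", CONDA_PYTHON,
          "- compiled packages may fail to import")
```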
    

    Copy everything over to Google Drive (click the link to fetch the authentication code then paste it into the box):

    from google.colab import drive 
    drive.mount('/content/drive')
    !tar -zcf conda_colab.tar.gz /content/miniconda3
    !cp conda_colab.tar.gz /content/drive/My\ Drive/
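
    The tar command above stores member paths without the leading slash (content/miniconda3/...). Below is a sketch of the same packing step with Python's standard tarfile module, run on a throwaway directory so the archive layout can be inspected before anything is copied to Drive. pack_directory and list_members are hypothetical helper names.

```python
import os
import tarfile
import tempfile

def pack_directory(src_dir, archive_path):
    """Gzip src_dir into archive_path, storing paths without the leading slash."""
    src_dir = os.path.abspath(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.lstrip(os.sep))
    return archive_path

def list_members(archive_path):
    """Return every member name stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()

# Demo on a stand-in directory rather than the real /content/miniconda3.
with tempfile.TemporaryDirectory() as tmp:
    fake_install = os.path.join(tmp, "miniconda3")
    os.makedirs(os.path.join(fake_install, "bin"))
    with open(os.path.join(fake_install, "bin", "conda"), "w") as fh:
        fh.write("#!/bin/sh\n")

    archive = pack_directory(fake_install, os.path.join(tmp, "conda_colab.tar.gz"))
    members = list_members(archive)
    expected_root = os.path.abspath(fake_install).lstrip(os.sep)
    print(members)
```

    For a real multi-gigabyte install the tar command is likely the faster choice; the Python version is mainly useful for checking what ended up in the archive.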
    

    2. Copy conda back to Colab (run whenever you restart a notebook)

    Mount Google Drive (requires auth code input again), copy back the conda installation, and set up the environment again:

    from google.colab import drive 
    drive.mount('/content/drive')
    
    # Member paths in the archive start with content/miniconda3, so extract relative to /
    !tar -xf /content/drive/My\ Drive/conda_colab.tar.gz -C /
    
    import os
    path = '/content/miniconda3/bin:' + os.environ['PATH']
    %env PATH=$path
    %env PYTHONPATH=
    import sys
    _ = sys.path.append("/content/miniconda3/lib/python3.7/site-packages")
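
    If the notebook is restarted without the runtime being reset, the install may still be on disk and the extraction can be skipped. A sketch of that guard using the standard tarfile module, demonstrated on a small stand-in archive; restore_if_missing is a hypothetical helper name.

```python
import os
import tarfile
import tempfile

def restore_if_missing(archive_path, extract_root, marker_path):
    """Extract archive_path under extract_root unless marker_path already exists.

    Returns True if extraction ran, False if it was skipped.
    """
    if os.path.exists(marker_path):
        return False
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(extract_root)
    return True

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny archive laid out like the real one (content/miniconda3/...).
    src_bin = os.path.join(tmp, "src", "content", "miniconda3", "bin")
    os.makedirs(src_bin)
    with open(os.path.join(src_bin, "conda"), "w") as fh:
        fh.write("#!/bin/sh\n")
    archive = os.path.join(tmp, "conda_colab.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(os.path.join(tmp, "src", "content"), arcname="content")

    root = os.path.join(tmp, "root")  # stands in for "/" on Colab
    os.makedirs(root)
    marker = os.path.join(root, "content", "miniconda3", "bin", "conda")
    first = restore_if_missing(archive, root, marker)   # extracts
    second = restore_if_missing(archive, root, marker)  # skipped
    print(first, second)
```

    On Colab the call would presumably look like restore_if_missing("/content/drive/My Drive/conda_colab.tar.gz", "/", "/content/miniconda3/bin/conda"). The install has to land at the same absolute path it was created in, because scripts in the conda bin directory refer to that path.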
    

    Notes

    • This solution is only lightly tested. You may need to set other environment variables depending on the packages you install.
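    • Step 2 may run faster with different tar compression settings. If your conda installation is very large, consider installing pigz with apt-get at the start of both Step 1 and Step 2, then adding --use-compress-program=pigz to the !tar commands (dropping the -z flag from the create command, since tar accepts only one compression option). The main gain is in compression; pigz decompresses largely on a single thread, so the speedup in Step 2 may be modest.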
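
    The pigz suggestion can be wrapped so that the same notebook still works when pigz is not installed. A sketch that only builds the argument list; tar_command is a hypothetical helper name, and the fallback to gzip is an assumption about what is wanted when pigz is missing.

```python
import shutil

def tar_command(mode, archive, target, compressor="pigz"):
    """Build a tar argument list, using `compressor` when it is on PATH, else gzip.

    mode is "create" (target is the directory to pack) or
    "extract" (target is the directory to extract into).
    """
    if mode not in ("create", "extract"):
        raise ValueError("mode must be 'create' or 'extract'")
    program = compressor if shutil.which(compressor) else "gzip"
    cmd = ["tar", f"--use-compress-program={program}"]
    if mode == "create":
        cmd += ["-cf", archive, target]
    else:
        cmd += ["-xf", archive, "-C", target]
    return cmd

create_cmd = tar_command("create", "conda_colab.tar.gz", "/content/miniconda3")
extract_cmd = tar_command("extract", "/content/drive/My Drive/conda_colab.tar.gz", "/")
print(create_cmd)
print(extract_cmd)
```

    Either list can then be passed to subprocess.run(cmd, check=True) in place of the corresponding !tar line.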