
GCP and Datalab with Python 3.6, Need to use Jupyter on GCP


I'm trying to use fastai's libraries, but some of the data-access tools built into those libraries depend on HTML objects. For example, DataBunch.show_batch produces an HTML object that is easiest to view in Jupyter. I need to run my testing on GCP (or another cloud), and here are the issues:

  • fastai depends on Python 3.6 or greater (it uses the new f-string formatting).
  • GCP doesn't have a good way to interface with Jupyter notebooks. I had it set up with some trouble, but then my computer needed a reformat, and now I'm questioning whether I should set it up again. The previous method was largely based on this.
  • GCP apparently has something meant to provide an interface between it and Jupyter notebooks, called Datalab. However, Datalab does not support Python 3.6 or greater, per this link.

I see a few options:

  1. Develop my own data-visualization techniques by subclassing fastai's libraries and skip Jupyter altogether.
  2. Create a Jupyter-to-GCP interface a different way, essentially redoing the steps from the link in the second bullet point above.
  3. Use one of the Docker containers I keep hearing about for Datalab, which would let me use my own version of Python.
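
On option 3, the general idea (this sketch is not Datalab-specific, and the base image and packages are illustrative assumptions, not a known-working setup) is to build a container whose Jupyter server runs on a newer Python:

```dockerfile
# Illustrative sketch only: a container running Jupyter on Python 3.7,
# sidestepping the older Python that Datalab is pinned to.
FROM python:3.7-slim
RUN pip install --no-cache-dir jupyter fastai
EXPOSE 8888
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]
```

Built and run with something like `docker build -t fastai-nb . && docker run -p 8888:8888 fastai-nb`, then opened at `http://localhost:8888`.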

Does anyone have other options for how I can make this connection? If not, can anyone provide other links for how to accomplish 1, 2, or 3?


Solution

  • You can follow this guide from fast.ai to create a VM with all the required libraries pre-installed. Then, following the same guide, you can access JupyterLab or Jupyter Notebook on that VM. It's simple, fast, and comes with Python 3.7.3.
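
For reference, the guide's setup boils down to two gcloud commands; the instance name, zone, machine type, and accelerator below are placeholder choices, not requirements:

```
# Sketch of the fast.ai GCP setup (name/zone/machine type are placeholders).
# Creates a VM from Google's Deep Learning VM image family, which ships with
# PyTorch, fastai, and a recent Python 3 pre-installed.
gcloud compute instances create my-fastai-instance \
    --zone=us-west1-b \
    --image-family=pytorch-latest-gpu \
    --image-project=deeplearning-platform-release \
    --machine-type=n1-standard-8 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --maintenance-policy=TERMINATE \
    --metadata=install-nvidia-driver=True \
    --boot-disk-size=200GB

# SSH in while forwarding port 8080, then open http://localhost:8080
# to reach the JupyterLab server that starts on the VM.
gcloud compute ssh --zone=us-west1-b jupyter@my-fastai-instance -- -L 8080:localhost:8080
```

Preemptible instances can cut the cost substantially if interruptions are acceptable for experimentation.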