python, google-cloud-platform, anaconda, jupyter-lab, cudf

How to install a library in a Google Cloud Platform AI Platform notebook instance


I am currently a data science undergraduate student and am trying to use a Google Cloud Platform AI Platform notebook instance for a data science project. The following images show what I am talking about.


enter image description here


enter image description here


I have no problem running the instance and using it to manipulate the data. However, since I want to use the cudf library to speed up data processing, I need to install it. After searching the internet, I tried the following. First, I opened the terminal:


enter image description here


Then I tried the following command and got these errors:


enter image description here


The command is from this website. 2. Then I tried the Anaconda way of installing it, using the method from the same website. I typed the following command in the terminal and got an 'UnsatisfiableError'.


enter image description here


From the above, you can see that I tried both the 10.0 version and the 9.42 version, but neither of them works. 3. Then I also tried the method from this website. I typed the following command in the terminal: conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cudf=0.8 python=3.6 cudatoolkit=9.2. The output is very long, so I show only the last part:


enter image description here


As you can see, this time the installation was successful. But when I open a new notebook and import the 'cudf' library, the following error appears:


enter image description here


It says there is no such library, but I just installed it.

I would really appreciate it if anyone could solve this problem for me, as I have been struggling with it for 7 hours.


Solution

  • In order to install dependencies in an AI Platform notebook you should indeed use pip (have a look at this link). However, it seems that RAPIDS no longer works with pip installation packages, so the proper approach is to create a Docker image or to install it using conda. In their GitHub repository you can find more information on how to do it.

    In this link you will find the RAPIDS Docker repository, where you can pull a Docker image with the dependencies required to run cuDF, and here you will find the steps to create an AI Platform Notebook instance using a custom container.

    As AI Platform Notebooks is in the Beta stage, it is possible that installing conda packages is not yet supported.

    As a workaround, I suggest installing the cuDF packages in a notebook running the ‘PyTorch’ environment instead of the ‘Numpy/SciPy/scikit-learn’ one that you are using, as its default pip version is older (19.1.1). To do so, you can check this link and see how to set up the ‘PyTorch’ environment.
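
One possible explanation for the import error after a successful conda install is that the notebook kernel runs a different Python environment from the one the terminal installed into. This is an assumption, not something the question confirms. A minimal sketch for checking it from a notebook cell is shown below; `describe_environment` is a hypothetical helper name, and it relies only on the standard library.

```python
import importlib.util
import sys


def describe_environment(package_name):
    """Report which interpreter is running and whether it can see a package.

    package_name should be a top-level module name such as "cudf".
    """
    spec = importlib.util.find_spec(package_name)
    return {
        # Path of the Python binary that this kernel (or script) is using.
        "executable": sys.executable,
        # Root directory of the environment that binary belongs to.
        "prefix": sys.prefix,
        # True if the package could be imported from this environment.
        "importable": spec is not None,
        # File the package would be loaded from, if it was found.
        "location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment("cudf").items():
        print(f"{key}: {value}")
```

Running the same snippet with the `python` from the terminal and comparing the `executable` and `prefix` values should show whether the two are the same environment. If they differ, the package was probably installed somewhere the notebook kernel does not look.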