Tags: r, kaggle, tensorflow-probability, greta

How to install tensorflow-probability in a Kaggle kernel for R


I need to install tensorflow-probability in a Kaggle kernel for the R language.

I tried the following code:

library(tensorflow)
install_tensorflow(extra_packages = "tensorflow-probability")

This seems to install only tensorflow: when I try to load the greta package, which depends on tensorflow-probability, I get the following error:

Error: 

greta requires TensorFlow (>=1.10.0) and Tensorflow Probability (>=0.5.0), but TensorFlow Probability isn't installed. Use:

install_tensorflow(extra_packages = "tensorflow-probability") 
to install the latest version.

I also tried installing it through the custom packages option, which shows it as installed, yet greta still reports it as not installed.


Solution

  • The key problem is that the preinstalled r-tensorflow virtual environment is not in the default location, which prevents install_tensorflow() from modifying it. To resolve this, one must first set the WORKON_HOME environment variable, which reticulate uses to locate the root directory of its virtualenv environments. I was able to get a proper installation along the following lines:

    # set virtualenv root to where 'r-tensorflow' env is located
    Sys.setenv(WORKON_HOME="/root/.virtualenvs")
    
    # install greta
    install.packages("greta")
    
    # install tfp
    tensorflow::install_tensorflow(envname="r-tensorflow", extra_packages=c("tensorflow-probability==0.3.0"))
    
    # check that TFP is installed in the env
    dir("/root/.virtualenvs/r-tensorflow/lib/python2.7/site-packages")
    ## ...
    ## [56] "tensorflow"                            
    ## [57] "tensorflow_probability"                
    ## [58] "tensorflow_probability-0.3.0.dist-info"
    ## [59] "tensorflow-1.10.0.dist-info"
    ## ...
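
The directory listing above can also be checked programmatically. The following is a minimal sketch in base R, assuming the usual virtualenv layout (`lib/pythonX.Y/site-packages`) and pip's `<name>-<version>.dist-info` naming; `venv_pkg_version` is a hypothetical helper, not part of reticulate or tensorflow.

```r
# Hypothetical helper: report the version of a Python package inside a
# virtualenv by reading "<name>-<version>.dist-info" names in site-packages.
venv_pkg_version <- function(pkg, envname = "r-tensorflow",
                             root = Sys.getenv("WORKON_HOME", "~/.virtualenvs")) {
  # site-packages sits under lib/pythonX.Y/, so match any Python version
  site <- Sys.glob(file.path(path.expand(root), envname,
                             "lib", "python*", "site-packages"))
  if (length(site) == 0) return(NA_character_)

  # dist-info names may use "-" or "_" where the pip name uses "-"
  pattern <- paste0("^", gsub("[-_]", "[-_]", pkg),
                    "-([0-9][^-]*)\\.dist-info$")
  hit <- grep(pattern, list.files(site[1]), value = TRUE)
  if (length(hit) == 0) return(NA_character_)

  # keep only the captured version string
  sub(pattern, "\\1", hit[1])
}

# Returns NA when the environment or the package is not found
venv_pkg_version("tensorflow-probability", root = "/root/.virtualenvs")
```

If reticulate is already bound to that environment, `reticulate::py_module_available("tensorflow_probability")` should give a more direct answer, since it tests whether the module can actually be imported.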
    

    Along these lines, I made a public Kaggle kernel available that runs the default Greta example.

    The above code installs greta v0.3.0, TF 1.10.0, and TFP 0.3.0, which is a matching set of versions. I was also able to install the latest versions using

    # set virtualenv root to where 'r-tensorflow' env is located
    Sys.setenv(WORKON_HOME="/root/.virtualenvs")
    
    # install latest greta
    devtools::install_github("greta-dev/greta")
    
    # install tfp
    tensorflow::install_tensorflow(envname="r-tensorflow", version="1.13.1", extra_packages=c("tensorflow-probability==0.6.0"))
    

    which also lets library(greta) load without complaint. However, sampling then crashed with a complaint about the assertthat package being corrupted. Note that assertthat gets updated as part of the greta install from GitHub, which is why I ended up using the CRAN version of greta.
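
To avoid mistyping the version pin, the two TensorFlow/TFP pairs mentioned in this answer can be kept in a small lookup table. This is only a sketch: `reported_pairs` and `tfp_requirement` are hypothetical names, and the table records just the combinations tried here, not an official compatibility matrix.

```r
# TensorFlow / TFP pairs tried in this answer (not an official matrix)
reported_pairs <- data.frame(
  tensorflow = c("1.10.0", "1.13.1"),
  tfp        = c("0.3.0", "0.6.0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build the pip requirement string for the TFP release
# paired with a given TensorFlow version; stops on versions not in the table.
tfp_requirement <- function(tf_version, pairs = reported_pairs) {
  i <- match(tf_version, pairs$tensorflow)
  if (is.na(i)) stop("no reported TFP pairing for TensorFlow ", tf_version)
  paste0("tensorflow-probability==", pairs$tfp[i])
}

tfp_requirement("1.13.1")
```

The result can be passed as `extra_packages` to `tensorflow::install_tensorflow()`, together with `envname = "r-tensorflow"` and the matching `version`, as in the calls above.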

    Hopefully in the future Kaggle just includes TFP and one won't have to deal with this mess.