
kerasR: "unknown url type: https" error from six urlretrieve (Python code run from R)


TL;DR: Fetching https URLs works in both Python and R, but not when the Python code is run from R.


While running some code from the documentation of the kerasR package, I repeatedly ran into the fatal error "unknown url type: https". The problem originates in Keras, which is implemented in Python.

I dug into the problem and found that it happens when Keras calls the function urlretrieve from the module six to retrieve data from an https URL.

I then tested the code in IPython and found that it works perfectly:

<!-- language: lang-python -->
from six.moves.urllib.request import urlretrieve
urlretrieve(url="https://www.google.com")

Then I tried the same thing from R, and it fails:

<!-- language: lang-r -->
library(reticulate)
py_run_string('from six.moves.urllib.request import urlretrieve')
py_run_string("urlretrieve(url='https://www.google.com')")

However, the same call works from R with plain http:

<!-- language: lang-r -->
py_run_string("urlretrieve(url='http://www.google.com')")

For the record, https works fine in R itself with packages like httr.

I am totally out of my depth here. What could be happening?

Here is some output about my environment.

R:

sessionInfo()
# R version 3.4.4 (2018-03-15)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 16.04.4 LTS
# other attached packages:
# [1] reticulate_1.6 kerasR_0.8.0  

And Python (as seen from R):

py_config()

version:        3.5.2 |Anaconda custom (64-bit)| 
(default, Jul  2 2016, 17:53:06)  [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]

I would much appreciate your time and effort.


Edit: Some more debug info in R:

py_run_string("from OpenSSL import SSL")
# ImportError: /xxx/python3.5/lib-dynload/_ssl.so: undefined symbol: SSLv2_method
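The ImportError above points at the standard library's _ssl extension, and Python's urllib only registers an https handler when the ssl module can be imported, so a failed ssl import would explain "unknown url type: https". A minimal diagnostic sketch under that assumption follows; the function name check_https_support is hypothetical and is not part of reticulate or Keras. Running it once in plain Python and once through reticulate should show whether the two sessions differ.

```python
import importlib
import urllib.request


def check_https_support():
    """Report whether this interpreter session can handle https URLs.

    Hypothetical helper for diagnosis only.
    """
    report = {}
    try:
        # The stdlib ssl module wraps the compiled _ssl extension.
        ssl = importlib.import_module("ssl")
        report["ssl_import_ok"] = True
        report["openssl_version"] = ssl.OPENSSL_VERSION
    except ImportError as exc:
        report["ssl_import_ok"] = False
        report["ssl_import_error"] = str(exc)
    # urllib.request defines HTTPSHandler only when ssl support is present.
    report["https_handler_available"] = hasattr(urllib.request, "HTTPSHandler")
    return report


if __name__ == "__main__":
    for key, value in sorted(check_https_support().items()):
        print(key, "=", value)
```

If ssl_import_ok is False only when the code runs from R, the R process is probably loading a different OpenSSL library than the one this Python build was compiled against, though that remains a guess until confirmed on the affected machine.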

Solution

  • For the time being, I am posting a partial solution to my particular problem. It does not solve the general problem with reticulate and SSL, but it is a decent temporary workaround for anyone using kerasR or Keras who finds that it cannot download models and datasets.

    Keras / kerasR uses a cache to avoid downloading the same object again. In my case, it is ~/.keras/

    So copy the URL of the failed download from the console, download the object with a browser, and save it to the Keras cache directory:

    cache for models : ~/.keras/models/
    cache for datasets : ~/.keras/datasets/
    (where ~ is my home directory)
    

    Of course, this is only a workaround, and I am still looking forward to someone posting a proper solution that works system-wide.
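The manual copy step described above can be sketched in Python. This is only an illustration: the helper name install_into_keras_cache and its parameters are hypothetical, and it assumes that Keras looks the file up in the cache by the same file name that appears at the end of the download URL.

```python
import shutil
from pathlib import Path


def install_into_keras_cache(downloaded_file, kind="datasets", cache_root=None):
    """Copy a manually downloaded file into the Keras cache.

    kind is "datasets" or "models". cache_root defaults to ~/.keras,
    the cache location mentioned in the answer above.
    """
    if kind not in ("datasets", "models"):
        raise ValueError("kind must be 'datasets' or 'models'")
    source = Path(downloaded_file).expanduser()
    if not source.is_file():
        raise FileNotFoundError(str(source))
    root = Path(cache_root) if cache_root is not None else Path.home() / ".keras"
    target_dir = root / kind
    # Create ~/.keras/datasets or ~/.keras/models if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # Keep the original file name (assumed to be what Keras looks for).
    shutil.copy2(str(source), str(target))
    return target
```

For example, after saving a dataset from the browser, install_into_keras_cache("~/Downloads/mnist.npz") would place it under ~/.keras/datasets/ (the file name here is only an example).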