Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start


I get a persistent error when calling H2O's h2o.automl function. I run the model repeatedly, and it fails completely after 5 to 10 runs.

Error in .h2o.__checkConnectionHealth() : 
  H2O connection has been severed. Cannot connect to instance at http://localhost:54321/
getaddrinfo() thread failed to start

In addition: There were 13 warnings (use warnings() to see them)
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix,  : 
  Unexpected CURL error: getaddrinfo() thread failed to start

I have updated Java in response to https://h2o-release.s3.amazonaws.com/h2o/rel-wolpert/4/docs-website/h2o-docs/faq/r.html (even though I am using a Linux virtual machine). I have added calls to h2o.removeAll() and gc() in response to the question "R h2o server CURL error, kind of repeatable". I have not attempted any changes regarding memory, because my cluster has 16+ GB and the highest reading I have seen in RStudio is 1.6 GiB.

H2O is running in R/RStudio Server on an Ubuntu 20.04 virtual machine. Could VirtualBox be blocking something?

The details on my H2O cluster are below:

openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.20.04, mixed mode, sharing)

Starting H2O JVM and connecting: ... Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         1 seconds 896 milliseconds 
    H2O cluster timezone:       America/Chicago 
    H2O data parsing timezone:  UTC 
    H2O cluster version:        3.35.0.2 
    H2O cluster version age:    19 hours and 24 minutes  
    H2O cluster name:           H2O_started_from_R_jholderieath_glq667 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   19.84 GB 
    H2O cluster total cores:    12 
    H2O cluster allowed cores:  12 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 
    R Version:                  R version 4.1.1 (2021-08-10) 

Solution

  • I think I also experienced this issue, although on macOS 12.1. I tried to debug it and found out that sometimes I also get another error:

    Unexpected CURL error: Failed to connect to 127.0.0.1 port 54321: Connection reset by peer
    

    I found out that this issue appears only when I have RCurl compiled against curl 7.68.0 and above.

    Downgrading to curl 7.67.0 resolved the issue for me, but then RStudio started crashing (Segmentation Fault), so I looked into the issue a little further.

    And I found out that compiling a recent version of curl with --disable-socketpair solved it for me as well.

    I monitored open files and sockets with lsof, and it looks as if the R process runs out of sockets it can create, at which point RCurl fails with one of these errors. Running gc() in R frequently helps (I called it after every single request), but the minimum number of open sockets after gc() still increases slowly and monotonically, which leads me to believe there is a leak. I reported this as a possible bug to the RCurl maintainers.
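The monitoring described above can be approximated with a small shell helper. This is a minimal sketch, not the exact procedure used in the answer: the function name `count_open_fds` is hypothetical, it counts all open descriptors (files and sockets together), and it assumes either a Linux-style /proc filesystem or an installed lsof.

```shell
# Count the open file descriptors (sockets included) of one process.
# Hypothetical helper: prefers /proc on Linux, falls back to lsof (e.g. on macOS).
count_open_fds() {
    pid="$1"
    if [ -d "/proc/$pid/fd" ]; then
        # One entry per open descriptor.
        ls "/proc/$pid/fd" | wc -l | tr -d ' '
    else
        # lsof also lists non-descriptor rows (cwd, txt, mem), so this
        # over-counts slightly; the trend over time is what matters.
        lsof -p "$pid" 2>/dev/null | tail -n +2 | wc -l | tr -d ' '
    fi
}

# Demo on the current shell; compare the count against the soft limit.
echo "open descriptors in this shell: $(count_open_fds $$) (soft limit: $(ulimit -n))"
```

To watch an R session instead, pass its process ID, for example `count_open_fds "$(pgrep -n rsession)"` (assuming the RStudio Server session process is named rsession), and call it between AutoML runs. A count that keeps climbing toward the `ulimit -n` value would be consistent with the leak suspected above.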

    For anybody using macOS and Homebrew, rebuilding curl with --disable-socketpair and reinstalling RCurl against it can be done as follows:

    $ brew edit curl # add --disable-socketpair to args list
    $ brew install --build-from-source curl # using reinstall might be needed instead of install
    
    $ export RCURL_PATH="/usr/local/opt/[email protected]" # can be found using `brew info curl`
    $ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
    $ export LDFLAGS="-L$RCURL_PATH/lib"
    $ export CPPFLAGS="-I$RCURL_PATH/include"
    $ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"
    
    $ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
    $ R -e 'RCurl::curlVersion()$version' # single quotes keep the shell from expanding $version; check if RCurl is using the proper version of curl
    
    

    Ubuntu 20.04 ships curl 7.68.0 (according to https://packages.ubuntu.com/focal/curl), and --disable-socketpair was only added in curl 7.73.0, so I think the steps below won't work with the stock package as-is. Since you are using a virtual machine, it might be easier to switch to Ubuntu 18.04, which is still supported and ships an old enough curl (7.58.0).
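The version reasoning above can be written down as a quick check. This is a sketch under the thresholds reported in this answer (problem seen from 7.68.0, --disable-socketpair available from 7.73.0). The function names and labels are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils and recent macOS).

```shell
# Succeeds if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Classify a libcurl version using the thresholds from this answer.
classify_curl() {
    if ! version_ge "$1" 7.68.0; then
        echo "unaffected"            # older than the first version where the error was seen
    elif version_ge "$1" 7.73.0; then
        echo "affected-rebuildable"  # can be rebuilt with --disable-socketpair
    else
        echo "affected-no-flag"      # error seen, but the configure flag does not exist yet
    fi
}

classify_curl 7.58.0   # Ubuntu 18.04
classify_curl 7.68.0   # Ubuntu 20.04
```

To classify the curl that an RCurl source build would pick up, something like `classify_curl "$(curl-config --version | awk '{print $2}')"` should work, since `curl-config --version` prints a line of the form `libcurl 7.68.0`.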

    I haven't used Ubuntu for a while, but I can at least provide some pseudo-code that should achieve the same thing:

    $ sudo apt install devscripts
    $ # make sure source repositories are enabled (uncommented in /etc/apt/s
    $ apt-get source curl
    $ sudo apt-get build-dep curl
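    $ cd curl-*/ # apt-get source unpacks into a versioned directory
    $ nano debian/rules # add the --disable-socketpair configure option
    $ dch -i # bump the version
    $ debuild -us -uc -b # build the package
    $ sudo dpkg -i ../curl_some_version.deb # placeholder name; RCurl links against libcurl, so the libcurl .deb files built alongside are likely needed too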
    
    $ # the exports below only matter if curl was installed to a non-default prefix; set RCURL_PATH to that prefix first
    $ export PATH="$RCURL_PATH/bin:$PATH" # for curl-config
    $ export LDFLAGS="-L$RCURL_PATH/lib"
    $ export CPPFLAGS="-I$RCURL_PATH/include"
    $ export PKG_CONFIG_PATH="$RCURL_PATH/lib/pkgconfig"
    
    $ R -e "chooseCRANmirror(graphics=FALSE, ind=1);install.packages('RCurl', type = 'source')"
    $ R -e 'RCurl::curlVersion()$version' # single quotes keep the shell from expanding $version; check if RCurl is using the proper version of curl
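
Before reinstalling RCurl, it may be worth confirming that the curl found first on PATH was really built with the flag. The sketch below relies on `curl-config --configure`, which prints the arguments given to configure at build time. The function name `built_without_socketpair` is hypothetical. Note that this inspects the curl development files on PATH, not the library an already installed RCurl is linked against.

```shell
# Succeeds if the given configure-argument string contains --disable-socketpair.
built_without_socketpair() {
    case "$1" in
        *--disable-socketpair*) return 0 ;;
        *) return 1 ;;
    esac
}

# Inspect the curl-config that a source build of RCurl would pick up.
if command -v curl-config >/dev/null 2>&1; then
    if built_without_socketpair "$(curl-config --configure)"; then
        echo "curl on PATH was configured with --disable-socketpair"
    else
        echo "curl on PATH was NOT configured with --disable-socketpair"
    fi
else
    echo "curl-config not found on PATH"
fi
```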