H2O h2o.importFile Error: 'Cannot determine file type. for nfs://.../model.zip', caused by water.parser.ParseDataset$H2OParse


I am trying to import an h2o model, a .zip file exported as a POJO, with R. The following error is all I get:

model_file <- "/Users/bernardo/Desktop/DRF_1_AutoML_20190816_133251.zip"
m <- h2o.importFile(model_file)
Error: DistributedException from localhost/127.0.0.1:54321: 'Cannot determine file type. for nfs://Users/bernardo/Desktop/DRF_1_AutoML_20190816_133251.zip', caused by water.parser.ParseDataset$H2OParseException: Cannot determine file type. for nfs://Users/bernardo/Desktop/DRF_1_AutoML_20190816_133251.zip

I already ran file.exists(model_file), which returns TRUE, so the file exists. I also tried normalizePath(model_file), with the same result. When I try to import the file into my R session, h2o seems to find it but can't import it for some reason.

Here's my R Session info:

R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] h2o_3.26.0.2      lares_4.7         data.table_1.12.2 lubridate_1.7.4   forcats_0.4.0    
 [6] stringr_1.4.0     dplyr_0.8.3       purrr_0.3.2       readr_1.3.1       tidyr_0.8.3      
[11] tibble_2.1.3      ggplot2_3.2.1     tidyverse_1.2.1  

Hope you guys can help me import my POJO model into R. Thanks!


Solution

  • OK, I found the solution I needed. The trick is to convert your data frame (df) to JSON and then predict with the .zip file generated by h2o, using h2o.predict_json instead of h2o.mojo_predict_df. I think it's pretty straightforward and less complicated. At least it worked the way I needed it to.

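    library(jsonlite)
    library(h2o)
    # zip_directory: path to the model .zip generated by h2o (a file path, despite the name)
    json <- toJSON(df)  # df: data frame holding the rows to score
    output <- h2o.predict_json(zip_directory, json)
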

    NOTE: No need to unzip the zip file.

    If you happen to use the lares package, simply use its h2o_predict_MOJO function.

    Hope this helps anyone else trying to achieve the same result.
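
For reference, the steps in the answer above can be wrapped into a small helper. This is only a sketch under a few assumptions: `predict_from_mojo` is a hypothetical name (it is not part of h2o or lares), `df` is assumed to contain the columns the model was trained on, and Java is assumed to be installed. As far as I know, h2o.predict_json also looks for h2o-genmodel.jar in the same directory as the .zip unless its genmodelpath argument is set.

```r
# Sketch: score a data frame with an exported h2o model .zip, without importing it into the cluster.
# predict_from_mojo is a hypothetical helper name, not part of h2o or lares.
predict_from_mojo <- function(df, mojo_zip) {
  # Fail early with a clear message if the path is wrong.
  if (!file.exists(mojo_zip)) {
    stop("MOJO zip not found: ", mojo_zip)
  }
  # toJSON() on a data frame yields a JSON array with one object per row.
  json <- jsonlite::toJSON(df)
  # Pass the path to the .zip itself; it does not need to be unzipped.
  h2o::h2o.predict_json(model = mojo_zip, json = json)
}

# Usage (path taken from the question; df is assumed to hold the model's feature columns):
# output <- predict_from_mojo(df, "/Users/bernardo/Desktop/DRF_1_AutoML_20190816_133251.zip")
```

The early file.exists() check is a design choice: a wrong path then produces a readable R error instead of a failure from the Java process that h2o.predict_json launches.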