XGBoost - H2O crashed due to an illegal memory access


H2O process crashed when doing a Grid Search with XGBoost:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  /tmp/xgboost/plugin/updater_gpu/src/device_helpers.cuh(387): an illegal memory access was encountered

The crash happened right after H2O logged the INFO messages below:

08-17 06:44:46.672 10.0.1.89:54321       14426  FJ-1-3    INFO: Checking convergence with logloss metric: 0.04519170911104479 --> 0.02811784326194906 (still improving)
.
08-17 06:44:46.672 10.0.1.89:54321       14426  FJ-1-3    INFO: For grid: final_grid built: 90 models.

The log then shows this Java exception:

08-17 06:44:46.742 10.0.1.89:54321       14426  #12317-18 INFO: GET /99/Grids/final_grid, parms: {}
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR: java.lang.IllegalArgumentException: Field not found: 'col_sample_rate_change_per_level/_col_sample_rate_change_per_level' on object hex.tree.xgboost.XGBoostModel$XGBoostParameters@49356589
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.util.PojoUtils.getFieldValue(PojoUtils.java:562)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at hex.grid.Grid.createSummaryTable(Grid.java:370)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at hex.schemas.GridSchemaV99.fillFromImpl(GridSchemaV99.java:158)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.api.GridsHandler.fetch(GridsHandler.java:41)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at java.lang.reflect.Method.invoke(Method.java:498)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.api.Handler.handle(Handler.java:63)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.api.RequestServer.serve(RequestServer.java:448)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.api.RequestServer.doGeneric(RequestServer.java:297)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.api.RequestServer.doGet(RequestServer.java:221)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.JettyHTTPD$LoginHandler.handle(JettyHTTPD.java:183)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at water.JettyHTTPD$LoginHandler.handle(JettyHTTPD.java:183)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.Server.handle(Server.java:370)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:         at java.lang.Thread.run(Thread.java:748)
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR: Caught exception:
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR: ERROR MESSAGE:
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR:
08-17 06:44:46.747 10.0.1.89:54321       14426  #12317-18 ERRR: Field not found: 'col_sample_rate_change_per_level/_col_sample_rate_change_per_level' on object hex.tree.xgboost.XGBoostModel$XGBoostParameters@49356589

Solution

  • col_sample_rate_change_per_level (the field named in the "Field not found" error) is not supported for XGBoost, only for GBM and random forest: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/col_sample_rate_change_per_level.html

    Here is the list of what you can use with an xgboost grid: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/grid-search.html#xgboost-hyperparameters

    (Of course, it ought to be telling you that, and not crashing, so definitely a bug!)
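
Until that is fixed, one possible workaround is to check the grid's hyperparameter names yourself before handing them to H2O. The sketch below is plain Python and does not call H2O at all. The helper name `split_grid_params` is hypothetical, and the `XGBOOST_GRID_PARAMS` set is only an illustrative subset that I am assuming is valid; copy the real list from the grid-search documentation linked above.

```python
def split_grid_params(hyper_params, supported):
    """Split a grid's hyperparameter dict into supported and unsupported parts.

    hyper_params: dict mapping parameter name -> list of values to try.
    supported:    set of parameter names the algorithm accepts in a grid.
    Returns (kept, dropped): the filtered dict and a sorted list of rejected names.
    """
    kept = {name: values for name, values in hyper_params.items() if name in supported}
    dropped = sorted(name for name in hyper_params if name not in supported)
    return kept, dropped


# Illustrative subset only (an assumption, not the full list from the docs).
XGBOOST_GRID_PARAMS = {
    "ntrees",
    "max_depth",
    "learn_rate",
    "sample_rate",
    "col_sample_rate",
    "col_sample_rate_per_tree",
}

# Example grid containing the parameter that triggered the error in the question.
hyper_params = {
    "max_depth": [4, 6, 8],
    "learn_rate": [0.05, 0.1],
    "col_sample_rate_change_per_level": [0.9, 1.0],
}

kept, dropped = split_grid_params(hyper_params, XGBOOST_GRID_PARAMS)
if dropped:
    print("Not usable in an XGBoost grid, removed:", dropped)
```

You would then pass `kept` as the grid's hyperparameters instead of the original dict, so the unsupported name never reaches the server.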