
Ignite Query / SqlQuery / SqlFieldsQuery: are "lazy" and "page size" actually doing anything?


Some background: we are building a machine learning farm off an Ignite cluster. Part of the use case is generating training data sets, which are gigantic matrices (in theory up to billions of rows by thousands of columns), one row per data entry in the Ignite cache.

We are using SqlQuery to fetch the records matching a predicate locally on each node, then iterate over the records, generate vectors, and write them to external storage for further consumption. Each node exports its data independently, so for 32 nodes we end up with 32 exported data sets.
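For reference, the per-node export loop looks roughly like this (a sketch; DataEntry, exporter and toVector are illustrative placeholders for our real types):

    import javax.cache.Cache;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.query.QueryCursor;
    import org.apache.ignite.cache.query.SqlQuery;

    Ignite ignite = Ignition.ignite();
    IgniteCache<Long, DataEntry> cache = ignite.cache("entries");

    // Local query: each node selects only the entries it holds.
    SqlQuery<Long, DataEntry> qry = new SqlQuery<>(DataEntry.class, "label = ?");
    qry.setArgs("train");
    qry.setLocal(true);

    try (QueryCursor<Cache.Entry<Long, DataEntry>> cur = cache.query(qry)) {
        for (Cache.Entry<Long, DataEntry> e : cur)
            exporter.write(toVector(e.getValue())); // one output row per cache entry
    }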

The problem: small data set generations worked OK, but larger ones (queries expected to return 10M+ rows per node) basically kill the entire cluster, blowing out the nodes with OOMEs and GC hell. We went through the "Performance and Debugging" section of the Ignite docs (https://apacheignite-sql.readme.io/docs/performance-and-debugging#result-set-lazy-load) and tried the lazy result set and page size settings. Nope.

Our investigation (profiling, memory dumps, debugger, etc.) suggested that the query result sets are loaded into memory completely before our code gets to read the first row, even though we iterate through a QueryCursor. Query#pageSize and SqlFieldsQuery#setLazy did not seem to have any effect on that whatsoever. Digging deeper, it turned out that H2 (which Ignite indexing uses) does not support server-side cursors in result sets at all, and can only be configured to buffer records onto disk (even with SSDs, this is a performance non-starter). Reading the Ignite source code suggested that Query#pageSize and SqlFieldsQuery#setLazy are only honored in distributed queries / scan queries, and that Ignite still fully reads result sets into memory on the nodes.
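Concretely, the page size knob we set looks like this (same illustrative names as above; it made no observable difference for the local SqlQuery):

    // Same local query as before, now with a small page size -- per the docs
    // this should cap how many rows are fetched per page; no effect locally.
    SqlQuery<Long, DataEntry> qry = new SqlQuery<>(DataEntry.class, "label = ?");
    qry.setArgs("train");
    qry.setLocal(true);
    qry.setPageSize(512); // Query#pageSize; 512 is an arbitrary illustrative value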

Sigh. These are the remediations we thought of:

  • Literally run thousands of tiny Ignite nodes with lots of unused memory, as opposed to dozens of big ones. Looks really cost-ineffective.
  • Set h2.maxMemoryRows to some small value. Seems like a silly solution (memory-to-memory in the same JVM through a buffer on disk? really?).
  • Ditch SqlQuery / SqlFieldsQuery, bypass H2, and use ScanQuery (see the sketch after this list). This is a ton of work (we would have to parse predicates and compile them into an IgniteBiPredicate), plus these are full table scans with no index / optimization, which sort of makes the whole ordeal pointless.
  • Persuade Thomas Mueller to do server-side cursors in H2 somehow (see How to set h2 to stream resultset?)?
  • Allocate gigantic RAM drives on each Ignite node to buffer the H2 record sets there (alright, this got weird, I'd probably stop).
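For the ScanQuery option, the rewrite would look roughly like this (a sketch under the same illustrative names; getLabel() stands in for whatever the hand-compiled predicate actually needs, and no index helps the scan):

    import org.apache.ignite.cache.query.ScanQuery;
    import org.apache.ignite.lang.IgniteBiPredicate;

    // The SQL predicate would have to be hand-compiled into a filter like this.
    IgniteBiPredicate<Long, DataEntry> filter =
        (key, val) -> "train".equals(val.getLabel());

    ScanQuery<Long, DataEntry> scan = new ScanQuery<>(filter);
    scan.setLocal(true);   // scan only this node's partitions
    scan.setPageSize(512); // scan queries do stream page by page

    try (QueryCursor<Cache.Entry<Long, DataEntry>> cur = cache.query(scan)) {
        for (Cache.Entry<Long, DataEntry> e : cur)
            exporter.write(toVector(e.getValue()));
    }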

The question: is our assessment correct, and is result set streaming simply not a thing for local queries? Are there any more sensible workarounds?

Would appreciate some insight from Ignite comrades if they read this. Thanks in advance!

UPDATE: in case it matters, the cluster is 8 nodes as follows:

    Architecture:          x86_64
    CPU(s):                32
    Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
    CPU MHz:               2345.904
    BogoMIPS:              4600.05
    Hypervisor vendor:     Xen
    Virtualization type:   full
    RAM:                   240Gb

These are designated EC2 instances with ephemeral volumes for the Ignite /data mount.


Solution

  • Updating this after seemingly solving the issue.

    Note: the question was about local queries.

    It looks like using SqlFieldsQuery with the lazy flag helped for local queries. A local SqlQuery seems to ignore the lazy flag.
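    In code, the variant that behaved for us looks roughly like this (a sketch, same illustrative names as in the question):

        import java.util.List;
        import org.apache.ignite.cache.query.FieldsQueryCursor;
        import org.apache.ignite.cache.query.SqlFieldsQuery;

        SqlFieldsQuery qry = new SqlFieldsQuery(
                "select id, features from DataEntry where label = ?")
            .setArgs("train")
            .setLocal(true) // still a local query, as in the question
            .setLazy(true); // with lazy on, rows arrive incrementally instead of
                            // the whole result set being materialized up front

        try (FieldsQueryCursor<List<?>> cur = cache.query(qry)) {
            for (List<?> row : cur)
                exporter.write(toVector(row)); // toVector overload for field rows, illustrative
        }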