I would like more details about configuring Apache Ignite (cluster, IGFS) together with HDFS. I can't find any official reference, so I doubt that it is possible with the open-source version of Apache Ignite, and I suspect I would need to switch to something like GridGain. Is that true? I would like to use Apache Ignite to perform in-memory computation with Spark, with a "kind of automatic" sync with Hadoop HDFS as the backend storage, because I don't want to load data from HDFS manually.
Thanks
You can still use Apache Ignite's integration with Spark to work with HDFS.
There are currently integrations for Spark 2.3, 2.4 and 3.0. The Spark 3.0 integration was added fairly recently and, for some reason, is not in the documentation yet, but it is available here:
https://downloads.apache.org/ignite/ignite-extensions/ignite-spark-ext/3.0.0/
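As a rough illustration of what using the Spark integration with HDFS can look like, here is a minimal PySpark sketch that reads a dataset from HDFS and saves it into an Ignite SQL table. It assumes the Ignite Spark integration jars are on the Spark classpath and that the input is stored as Parquet. The function names, the input format and the default save mode are my own choices for the example; the `ignite` format name and the `config`, `table` and `primaryKeyFields` options are the ones used by Ignite's Spark DataFrame support. Note that this is an explicit load step driven by Spark, not an automatic sync.

```python
# Sketch only: copy a dataset from HDFS into an Ignite SQL table through Spark.
# Assumption: the Ignite Spark integration jars are on the Spark classpath.

IGNITE_FORMAT = "ignite"  # data source name used by Ignite's DataFrame support


def ignite_write_options(config_file, table, primary_key):
    """Build the writer options for saving a DataFrame to an Ignite table.

    Hypothetical helper, not part of Ignite or Spark.
    """
    return {
        "config": config_file,            # path to the Ignite XML configuration
        "table": table,                   # target SQL table in Ignite
        "primaryKeyFields": primary_key,  # needed when Ignite creates the table
    }


def load_hdfs_into_ignite(spark, hdfs_path, config_file, table, primary_key,
                          mode="overwrite"):
    """Read Parquet data from HDFS and save it to Ignite (hypothetical helper).

    `spark` is an existing SparkSession; `hdfs_path` is an hdfs:// URI.
    """
    df = spark.read.parquet(hdfs_path)  # assumption: the data is Parquet
    writer = df.write.format(IGNITE_FORMAT).mode(mode)
    options = ignite_write_options(config_file, table, primary_key)
    for key, value in options.items():
        writer = writer.option(key, value)
    writer.save()
    return df
```

With a running SparkSession this would be called as, for example, `load_hdfs_into_ignite(spark, "hdfs://namenode:8020/data/person", "/path/to/ignite-client.xml", "person", "id")`, where the HDFS URI, the config path and the table and key names are placeholders. Reading the table back goes through the same format: `spark.read.format("ignite").option("config", ...).option("table", ...).load()`.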
You can also watch my webinar about this integration:
https://www.youtube.com/watch?v=lkRh2TO8VSU
You can also find examples here: