Fault Tolerance in Ignite Services

I have deployed Ignite services (some jobs to be performed on Ignite nodes) on 4 Ignite nodes. If one node fails while a job is midway through execution, I want another node to take over the job and continue executing it. How can this type of failure be handled in the Service Grid? I have read about the failover SPI and the checkpoint SPI. Which of these can be used in my case? Are there any examples?

Thanks.

Solution

Failover, checkpointing, and related features are part of the Compute Grid [1]. From what you describe, the Compute Grid fits your use case much better than the Service Grid.

[1] https://apacheignite.readme.io/docs/compute-grid
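
To make the idea concrete, here is a minimal plain-Java sketch of how failover and checkpointing combine. It does not use the Ignite API: the class name, the in-memory CHECKPOINTS map (a stand-in for a checkpoint store that every node can reach), and the simulated node crash are all assumptions made for illustration. The job saves its progress after every step, and when the first attempt dies, a second attempt resumes from the saved state instead of starting over.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Conceptual sketch only: every name here is hypothetical, not Ignite API.
class CheckpointFailoverSketch {
    // Stand-in for a checkpoint store shared by all nodes.
    // Value layout: {next step to run, partial sum so far}.
    static final Map<String, long[]> CHECKPOINTS = new ConcurrentHashMap<>();

    // Counts executed steps so we can see that no work is repeated.
    static final AtomicInteger STEPS_EXECUTED = new AtomicInteger();

    // Thrown to simulate the node dying in the middle of a job.
    static class NodeFailedException extends RuntimeException {
        NodeFailedException(String msg) { super(msg); }
    }

    // Sums 1..totalSteps, checkpointing after each step.
    // A negative crashAfterStep means the node never crashes.
    static long runJob(String jobId, int totalSteps, int crashAfterStep) {
        long[] saved = CHECKPOINTS.get(jobId);
        int next = (saved == null) ? 1 : (int) saved[0];
        long sum = (saved == null) ? 0L : saved[1];

        for (int step = next; step <= totalSteps; step++) {
            sum += step;
            STEPS_EXECUTED.incrementAndGet();
            // Persist progress before anything else can go wrong.
            CHECKPOINTS.put(jobId, new long[] {step + 1, sum});
            if (step == crashAfterStep)
                throw new NodeFailedException("node died after step " + step);
        }
        CHECKPOINTS.remove(jobId); // job finished, checkpoint no longer needed
        return sum;
    }

    // Failover: if the first "node" fails, rerun the job on another one,
    // which picks up from the last checkpoint.
    static long runWithFailover(String jobId, int totalSteps, int crashAfterStep) {
        try {
            return runJob(jobId, totalSteps, crashAfterStep);
        } catch (NodeFailedException e) {
            return runJob(jobId, totalSteps, -1);
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithFailover("job-1", 10, 4)); // prints 55
        System.out.println(STEPS_EXECUTED.get());            // prints 10
    }
}
```

The step counter ends at 10 rather than 14, which shows that the second attempt continued from step 5 instead of repeating steps 1 to 4. In Ignite, the failover SPI and the checkpoint SPI mentioned in the question play these two roles for Compute Grid jobs.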