
Simulating apache ignite node failure for unit testing


For unit testing, I'm running multiple Ignite nodes in the same JVM (as discussed in the answer here).

In my unit tests I want to simulate what happens when an Ignite node unexpectedly dies, e.g. to simulate a sudden hardware failure and the effects of it on the cluster as a whole. I can't just call Ignite.close() because this does an orderly shutdown, and I want a sudden disorderly shutdown, with all the problems it could bring.

I can think of two potential ways to do this:

  1. When I start a new Ignite node in my JVM, somehow keep track of all the threads it spawns and then use the (deprecated) method Thread.stop() to kill all its threads abruptly when I want to simulate it dying. I can't see a way of reliably distinguishing threads launched by one Ignite instance in the JVM from threads spawned by another, though, unless I assume Ignite starts all its threads at startup only.

  2. Launch a separate process using the Java process API (see first answer here), start Ignite in that process and then kill it by force using Process.destroy() or Process.destroyForcibly().
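
Option 2 could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive implementation: `NodeProcess` and its methods are hypothetical helper names, and the main class passed to `start` (for example, a small class that calls `Ignition.start()`) is assumed to exist on the parent's classpath. It uses only the Java 8 process API.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a node in a child JVM so it can be killed abruptly.
final class NodeProcess {

    // Builds the command line for a child JVM that reuses the parent's
    // java binary and classpath and runs the given main class.
    static List<String> javaCommand(String mainClass, String... args) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        cmd.addAll(Arrays.asList(args));
        return cmd;
    }

    // Starts the child JVM; its stdout/stderr are forwarded to the parent.
    static Process start(String mainClass, String... args) throws IOException {
        return new ProcessBuilder(javaCommand(mainClass, args)).inheritIO().start();
    }

    // Forcibly terminates the child (no orderly shutdown is requested) and
    // waits up to timeoutSeconds. Returns true if the process has exited.
    static boolean kill(Process process, long timeoutSeconds) throws InterruptedException {
        process.destroyForcibly();
        return process.waitFor(timeoutSeconds, TimeUnit.SECONDS);
    }
}
```

A test might then call `Process node = NodeProcess.start("com.example.IgniteNodeMain");` (a hypothetical class name), wait until the node has joined the topology, and call `NodeProcess.kill(node, 10)` to simulate the crash.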

Does anyone know a better approach?

I'm using Java 8.


Solution

    1. When I start a new Ignite node in my JVM, somehow keep track of all the threads it spawns and then use the (deprecated) method Thread.stop() to kill all its threads abruptly when I want to simulate it dying. I can't see a way of reliably distinguishing threads launched by one Ignite instance in the JVM from threads spawned by another, though, unless I assume Ignite starts all its threads at startup only.

    If you are testing how your application behaves during cluster topology changes, there is no need to use private APIs or Ignite's internal threads and classes. Such tests are already maintained in the Ignite test framework. Moreover, there are additional integration tests based on the Ducktape framework (IEP-56), which run Ignite nodes in separate JVMs by means of Docker.

    I'm not completely sure about your case, but it seems that you need something like Testcontainers.

    For example, the following code downloads the Docker image and runs an Ignite node with a fixed port for the thin client (unfortunately, the thin client cannot connect to an ephemeral port):

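    FixedHostPortGenericContainer<?> ignite = new FixedHostPortGenericContainer<>("apacheignite/ignite")
        .withFixedExposedPort(10800, 10800); // thin-client port: host port, container port
    ignite.start(); // pulls the image if needed and starts the node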

    Then call ignite.stop() to simulate an unexpected node failure.

    Moreover, Testcontainers includes some useful modules, such as Toxiproxy, which lets you add "network instability": slowing down requests, resetting the peer connection, etc.