How to avoid K8s killing the pod during a large migration

We are in the middle of a migration project where we want to migrate millions of documents from one MongoDB collection to another (or, in certain cases, add a new field to existing documents). During testing we see that when the operation takes longer than 10 minutes (which is expected), K8s kills the pod because the pod's health check did not pass.

Am I missing something? Are there any best practices to follow for such large migrations? We expect the migration to run for as long as 3-4 hours in a single stretch.

Solution

This addresses your question about how to run Mongock asynchronously so that the pod is not killed while the migration is still in progress.

Regardless of whether you are using Spring Boot or not, you need the following:

1. Build the Mongock runner using the builder.
2. Run Mongock in its own thread, with something like new Thread(mongockRunner::execute).start().
3. Create event listeners for the Mongock success and failure events. Ideally listen for the start event as well, to indicate that the Mongock process has started, although this is not mandatory. The code for both cases, standalone and Spring Boot, is further down.
4. Expose two different endpoints for your liveness and readiness probes.

The above is the foundation; on top of it you can take multiple approaches. I would go with a shared object (MongockStateTracker) that holds the Mongock state and is updated from the listeners.

The MongockStateTracker class would look something like this:

public class MongockStateTracker {
    public enum State {
        NOT_STARTED, RUNNING, FINISHED_OK, FINISHED_FAILED
    }

    // volatile so that the probe threads always see the latest value
    // written by the Mongock thread; single writes need no extra lock
    private volatile State state = State.NOT_STARTED;

    public void setFinishedOk() {
        state = State.FINISHED_OK;
    }

    public void setFinishedFailed() {
        state = State.FINISHED_FAILED;
    }

    public void setRunning() {
        state = State.RUNNING;
    }

    public State getState() {
        return state;
    }

    public boolean isNotFinished() {
        // Read the state once so both comparisons use the same snapshot
        State current = state;
        return current == State.NOT_STARTED || current == State.RUNNING;
    }
}
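
As a rough illustration of how such a tracker behaves when the migration runs on another thread, the sketch below starts a stand-in task in the background and reads the state from the main thread, the way a probe endpoint would. Everything in it is hypothetical: DemoStateTracker is a compact AtomicReference-based stand-in for MongockStateTracker, the plain Runnable takes the place of mongockRunner::execute, and the try/catch only imitates the started, success and failure events.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for MongockStateTracker, kept small for the demo.
class DemoStateTracker {
    enum State { NOT_STARTED, RUNNING, FINISHED_OK, FINISHED_FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.NOT_STARTED);

    void set(State newState) { state.set(newState); }

    State get() { return state.get(); }
}

class BackgroundMigrationDemo {

    // Runs the task on its own thread and returns that thread so callers can join it.
    static Thread runAsync(Runnable migration, DemoStateTracker tracker) {
        Thread worker = new Thread(() -> {
            tracker.set(DemoStateTracker.State.RUNNING);              // imitates the "started" event
            try {
                migration.run();
                tracker.set(DemoStateTracker.State.FINISHED_OK);      // imitates the "success" event
            } catch (RuntimeException e) {
                tracker.set(DemoStateTracker.State.FINISHED_FAILED);  // imitates the "failure" event
            }
        }, "migration-worker");
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        DemoStateTracker tracker = new DemoStateTracker();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread worker = runAsync(() -> {
            started.countDown();          // signal that the fake migration is in progress
            try {
                release.await();          // block until the main thread lets it finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, tracker);

        started.await();
        System.out.println("during migration: " + tracker.get()); // RUNNING
        release.countDown();
        worker.join();
        System.out.println("after migration: " + tracker.get());  // FINISHED_OK
    }
}
```

The point of the demo is that the caller is never blocked by the long-running task, yet it can always ask the tracker what is going on.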

Then the liveness probe endpoint needs logic along these lines (note that you need to inject the shared instance of MongockStateTracker):

if (stateTracker.isNotFinished() || stateTracker.getState() == MongockStateTracker.State.FINISHED_OK) {
    return "ALIVE";
} else {
    return "NOT ALIVE";
}

The readiness probe endpoint needs logic along these lines (again, with the shared instance of MongockStateTracker injected):

if (stateTracker.getState() == MongockStateTracker.State.FINISHED_OK) {
    return "READY";
} else {
    return "NOT READY";
}
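
One detail worth keeping in mind when wiring these checks to real endpoints: a Kubernetes HTTP probe looks at the response status code, and any status from 200 to 399 counts as success, so the negative branches have to return a failing status and not just a different body. The following is a minimal sketch of both endpoints using the JDK's built-in com.sun.net.httpserver server, without any framework. The paths /liveness and /readiness, the choice of 503 for failures, and the class name ProbeServerSketch are assumptions made for illustration; the AtomicReference stands in for the shared tracker.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

class ProbeServerSketch {
    enum State { NOT_STARTED, RUNNING, FINISHED_OK, FINISHED_FAILED }

    // Liveness: only a failed migration should make Kubernetes restart the pod.
    static int livenessStatus(State state) {
        return state == State.FINISHED_FAILED ? 503 : 200;
    }

    // Readiness: accept traffic only once the migration has finished successfully.
    static int readinessStatus(State state) {
        return state == State.FINISHED_OK ? 200 : 503;
    }

    // Writes a small plain-text body with the given status and closes the exchange.
    private static void respond(HttpExchange exchange, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(bytes);
        }
    }

    // Starts the probe server; passing port 0 lets the OS pick a free port.
    static HttpServer start(AtomicReference<State> state, int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/liveness", exchange -> {
            int status = livenessStatus(state.get());
            respond(exchange, status, status == 200 ? "ALIVE" : "NOT ALIVE");
        });
        server.createContext("/readiness", exchange -> {
            int status = readinessStatus(state.get());
            respond(exchange, status, status == 200 ? "READY" : "NOT READY");
        });
        server.start();
        return server;
    }
}
```

With this split, a pod that is still migrating stays alive but receives no traffic, which is the behaviour you want for a migration lasting several hours.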

If you are using Spring Boot, the best way is to inject an ApplicationRunner bean that executes Mongock. It should be something like this:

@Bean
public ApplicationRunner mongockApplicationRunner(ApplicationContext springContext,
                                                  MongoTemplate mongoTemplate) {

    MongockRunner mongockRunner = MongockSpringboot.builder()
            .addMigrationScanPackage("YOUR_MIGRATION_PACKAGE_PATH")
            .setEventPublisher(springContext)
            .setSpringContext(springContext)
            //more setters
            .buildRunner();

    // Start Mongock on its own thread so that application startup is not blocked
    return args -> new Thread(mongockRunner::execute).start();
}
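
As an optional safety net, you may want the tracker to end up in the failed state even if an exception escapes the background thread without the failure listener having fired. Whether that can actually happen with Mongock is an assumption on my side; the sketch below only shows the plain JDK mechanism, Thread.setUncaughtExceptionHandler, with a stand-in Runnable in place of mongockRunner::execute. The names GuardedThreadDemo and startGuarded and the thread name are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class GuardedThreadDemo {

    // Starts the task on a named thread; onCrash runs if the task throws.
    static Thread startGuarded(Runnable task, Runnable onCrash) {
        Thread worker = new Thread(task, "mongock-runner");
        // The handler is invoked on the dying thread, before it terminates.
        worker.setUncaughtExceptionHandler((thread, error) -> onCrash.run());
        worker.start();
        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean failed = new AtomicBoolean(false);
        Thread worker = startGuarded(
                () -> { throw new IllegalStateException("simulated failure"); },
                () -> failed.set(true));
        worker.join();
        System.out.println("failed = " + failed.get()); // failed = true
    }
}
```

In the bean above this would translate to something like startGuarded(mongockRunner::execute, stateTracker::setFinishedFailed), so that the liveness probe starts failing if the thread dies unexpectedly.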

Finally, you need to update the MongockStateTracker's state. Here is the code for both cases, standalone and Spring Boot.

Standalone: you need to add the consumers in the builder itself:

MongockStandalone.builder()
        //...more setters
        .setMigrationStartedListener(startedEvent -> stateTracker.setRunning())
        .setMigrationSuccessListener(successEvent -> stateTracker.setFinishedOk())
        .setMigrationFailureListener(failEvent -> stateTracker.setFinishedFailed());

If you are using Spring Boot, you need to inject a bean for each listener, like the following:


    @Bean
    public ApplicationListener<SpringMigrationFailureEvent> failureEventListener() {
        return event -> stateTracker.setFinishedFailed();
    }

    @Bean
    public ApplicationListener<SpringMigrationSuccessEvent> successEventListener() {
        return event -> stateTracker.setFinishedOk();
    }

    @Bean
    public ApplicationListener<SpringMigrationStartedEvent> startedEventListener() {
        return event -> stateTracker.setRunning();
    }

That should do the work ;)