Search code examples
apache-flink

org.apache.flink.runtime.checkpoint.CheckpointException: Some tasks of the job have already finished


I want to stop a flink task by rest api, and I send request: http://192.168.215.165:8081/jobs/c952ba860604a2c32a7abb9eb5b42b0d/stop ,then I got resoponse :

 {
"request-id": "29c559399243c817055ebbaf7431a8d2"
}

And then I send request: http://192.168.215.165:8081/jobs/c952ba860604a2c32a7abb9eb5b42b0d/savepoints/29c559399243c817055ebbaf7431a8d2, I got the response:(part of)

{
    "status": {
        "id": "COMPLETED"
    },
    "operation": {
        "failure-cause": {
            "class": "java.util.concurrent.CompletionException",
            "stack-trace": "java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointException: Some tasks of the job have already finished and checkpointing with finished tasks is not enabled. Failure reason: Not all required tasks are currently running.\n\tat java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)\n\tat java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)\n\tat java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925)\n\tat java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913)\n\tat java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)\n\tat java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)\n\tat org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:246)\n\tat java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)\n\tat java.util.concurrent. 

How can I stop a flink job by rest API please ?


Solution

  • A few alternatives:

    • set execution.checkpointing.checkpoints-after-tasks-finish.enabled: true in your configuration (this is a somewhat experimental feature that was added in 1.14, but it should work)
    • modify your job so that all of the tasks are still running at the time when the job is ready to be stopped
    • terminate the job without taking a savepoint