
How to update a Flink job via the UI without losing the state, plus some state backend questions


I have the following code:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend

val stateUri = "file:///tmp/"

new RocksDBStateBackend(stateUri, true) // true enables incremental checkpoints
  1. When I deploy a new version of my job via the UI, what should I do in order to keep the state?
  2. Is it enough to put the stateUri in the savepoint path?
  3. If I want to scale out, can I deploy the same jar again with the same path?
  4. What will happen if two different jars use the same backend stateUri?

Solution

    1. In order to upgrade your job, you should first take a savepoint via bin/flink savepoint <JOB_ID> <TARGET_DIRECTORY>. Alternatively, you can cancel the job with a savepoint, which creates a savepoint and then stops the job: bin/flink cancel --withSavepoint <TARGET_DIRECTORY> <JOB_ID>. Both CLI calls print the path of the created savepoint, which will be stored under your TARGET_DIRECTORY. In order to resume from this savepoint, enter this path into the Savepoint Path field in the UI, or submit the job via bin/flink run --fromSavepoint <SAVEPOINT_PATH> <JAR>.
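The upgrade workflow above can be sketched as a short script. JOB_ID, the savepoint directory, the savepoint name, and myJob.jar are all placeholders, and the flink commands are only printed here rather than executed:

```shell
# Placeholder job id and target directory; real values come from your cluster.
JOB_ID=0ba86fd9d1b29d90796e4a7d27f9b2f9
TARGET_DIRECTORY=/tmp/savepoints

# 1. Take a savepoint, or cancel with a savepoint (which also stops the job):
echo "bin/flink savepoint $JOB_ID $TARGET_DIRECTORY"
echo "bin/flink cancel --withSavepoint $TARGET_DIRECTORY $JOB_ID"

# 2. Resume the upgraded jar from the savepoint path the CLI printed:
echo "bin/flink run --fromSavepoint $TARGET_DIRECTORY/savepoint-1 myJob.jar"
```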
    2. No, the stateUri is only the base path under which the state backend stores its checkpoints. The state backend creates a subdirectory named after the job id, under which it stores that job's checkpoints. Thus the path of a checkpoint usually looks like stateUri/JOB_ID/chk-1, where JOB_ID is a UUID-like hash (e.g. 0ba86fd9d1b29d90796e4a7d27f9b2f9) and chk-1 denotes the first checkpoint.
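To make the layout concrete, the directory structure described above can be simulated by hand; the paths below are created with mkdir for illustration, not by Flink itself:

```shell
# Checkpoints live under stateUri/<JOB_ID>/chk-<n>; both values are examples.
STATE_URI=/tmp/flink-state-demo
JOB_ID=0ba86fd9d1b29d90796e4a7d27f9b2f9
mkdir -p "$STATE_URI/$JOB_ID/chk-1"
ls "$STATE_URI/$JOB_ID"   # prints: chk-1
```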
    3. In order to scale out a job, you should take a savepoint, cancel the job, and resubmit it, resuming from the savepoint with an increased parallelism (e.g. bin/flink run --fromSavepoint <SAVEPOINT_PATH> --parallelism 10 <JAR>).
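A minimal sketch of the rescale step, assuming a savepoint has already been taken; the savepoint path and jar name are placeholders, and the command is only printed:

```shell
# Rescaling reuses the existing savepoint, only the parallelism changes.
SAVEPOINT_PATH=/tmp/savepoints/savepoint-1
echo "bin/flink run --fromSavepoint $SAVEPOINT_PATH --parallelism 10 myJob.jar"
```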
    4. Every job has a unique job id. Therefore, you will find two subdirectories under stateUri, one per job id, and the checkpoints of the two jobs are stored separately in their respective subdirectories.
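The resulting layout for two jobs sharing one stateUri can be simulated the same way; both job ids below are made up for illustration, since real ones are assigned by Flink at submission time:

```shell
# Two jobs, one base path: each gets its own job-id subdirectory.
STATE_URI=/tmp/flink-shared-state-demo
mkdir -p "$STATE_URI/0ba86fd9d1b29d90796e4a7d27f9b2f9/chk-1"
mkdir -p "$STATE_URI/7c3f2a9e8d1b4c5a9f0e6d2b8a7c4e1f/chk-1"
ls "$STATE_URI"   # one directory per job id, checkpoints kept apart
```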