Search code examples
apache-kafkaapache-flinkflink-streaming

how offsets kept in savepoint


We are using kafka as source of our pipeline. I want to take the existing state from production environment to a new environment. My question is what happens with the offset in the new environment ? since we took the savepoint from production and the offsets are kept in the savepoint does it mean that in the new environment the job will start consuming messages with the offsets from production or it will actually starts with a new ones like as new consumer ?


Solution

  • The offsets in the new job will start from those stored in the savepoint, provided you restart the new job from the savepoint, like this:

    $ bin/flink run -s :savepointPath [:runArgs]
    

    Relevant docs include the last paragraph of this section about Kafka Consumers Start Position Configuration, which states

    Note that these start position configuration methods do not affect the start position when the job is automatically restored from a failure or manually restored using a savepoint. On restore, the start position of each Kafka partition is determined by the offsets stored in the savepoint or checkpoint ...

    as well as this section about Resuming from Savepoints.