Flink - Lazy start with operators working during savepoint startup

I am using Apache Flink with RocksDBStateBackend and going through some trouble when the job is restarted using a savepoint.

Apparently, it takes some time for the state to be ready again, but even though the state isn't ready yet, DataStreams from Kafka seems to be moving data around, which causes some invalid misses as the state isn't ready yet for my KeyedProcessFunction.

Is it the expected behavior? I couldn't find anything in the documentation, and apparently, no related configuration.

The ideal for us would be to have the state fully ready to be queried before any data is moved.

For example, this shows that during a deployment, the estimate_num_keys metric was slowly increasing.

However, if we look at an application counter from an operator, they were working during that "warm-up phase".

I found some discussion here Apache flink: Lazy load from save point for RocksDB backend where it was suggested to use Externalized Checkpoints.

I will look into it, but currently, our state isn't too big (~150 GB), so I am not sure if that is the only path to try.

Solution

Starting a Flink job that uses RocksDB from a savepoint is an expensive operation, as all of the state must first be loaded from the savepoint into new RocksDB instances. On the other hand, if you use a retained, incremental checkpoint, then the SST files in that checkpoint can be used directly by RocksDB, leading to must faster start-up times.

But, while it's normal for starting from a savepoint to be expensive, this shouldn't lead to any errors or dropped data.