Search code examples
apache-flink

Example of raw vs managed state


I am trying to understand the difference between raw and managed state. From the docs:

Keyed State and Operator State exist in two forms: managed and raw.

Managed State is represented in data structures controlled by the Flink runtime, such as internal hash tables, or RocksDB. Examples are “ValueState”, “ListState”, etc. Flink’s runtime encodes the states and writes them into the checkpoints.

Raw State is state that operators keep in their own data structures. When checkpointed, they only write a sequence of bytes into the checkpoint. Flink knows nothing about the state’s data structures and sees only the raw bytes.

However, I have not found any example highlighting the difference. Can anyone provide a minimal example to make the difference clear in code?


Solution

  • Operator state is only used in Operator API which is intended only for power users and it's not as stable as the end-user APIs, which is why we rarely advertise it. As an example, consider AbstractUdfStreamOperator, which represents an operator with an UDF. For checkpointing, the state of the UDF needs to be saved and on recovery restored.

    @Override
    public void snapshotState(StateSnapshotContext context) throws Exception {
        super.snapshotState(context);
        StreamingFunctionUtils.snapshotFunctionState(context, getOperatorStateBackend(), userFunction);
    }
    
    @Override
    public void initializeState(StateInitializationContext context) throws Exception {
        super.initializeState(context);
        StreamingFunctionUtils.restoreFunctionState(context, userFunction);
    }
    

    At this point, the state could be serialized as just a byte blob. As long as the operator can restore the state by itself, the state can take an arbitrary shape.

    However, coincidentally in the past, much of the operator states have also been (re-)implemented as managed state. So the line is more blurry in reality.