Search code examples
apache-flinkflink-streaming

Is there a way to define a Flink POJO type which recursively references itself?


I'm trying to store a tree structure in state, but am having some difficulty creating a custom TypeInformation without the self-reference being created as a generic type.

For example, my model would look something like:

public class Node {
    private Map<String, Node> nodeMap;
}

I see that Flink supports recursive types for Avro state, but I can't seem to find a workaround for POJOs.


Solution

  • After digging into the Flink serialization code (as of 1.11), it appears that when Flink detects recursion, it will fallback to Kryo for all descendants of the root node.

    The workaround is to store your nodes in a flattened data structure such as a HashMap (or Flink's native MapState), keyed by some sort of nodeId, and then use that reference in your tree and use it to perform look-ups of the actual node.