java, apache-kafka, apache-kafka-streams

Kafka Streams KTable-KTable foreign key join emits null even if right side is empty


What are the semantics of a Kafka Streams (3.7.1) KTable-KTable foreign key join when the extracted foreign key has never matched a primary key in the right-side KTable?

In this example the right side is empty and nothing matches.

        ...
        // Foreign key join: the person's name is used as the key into niceName.
        KTable<String, String> personsWithNiceName = person.join(niceName,
            name -> name,
            (name, nice) -> nice + " " + name);
        personsWithNiceName.toStream().to("Result");

When writing the first message, nothing matches, and nothing is emitted. This aligns with my expectations.

        inputPersonTopic.pipeInput("1", "Jane");

        outputTopic.readKeyValuesToList().forEach((rec) -> {
            System.out.println("Key: " + rec.key + " Value: " + rec.value);
        });

// nothing is emitted

However, when I write any message for the same key again, a tombstone is emitted. This made me a bit sad.

        inputPersonTopic.pipeInput("1", "Jane");

        outputTopic.readKeyValuesToList().forEach((rec) -> {
            System.out.println("Key: " + rec.key + " Value: " + rec.value);
        });

//Key: 1 Value: null

Is this expected? Since there is no previous value and nothing to delete, I had hoped that subsequent updates would be suppressed too.

In a real-world example, the tombstones cascade down the topology and into the consumers, causing much ado about nothing.

Should I try to suppress this? If so, what is the most efficient way? I can only think of a processor with a store of "seen" keys that ignores tombstones for unseen keys.


Solution

  • You are hitting a known bug: https://issues.apache.org/jira/browse/KAFKA-16394

    Strictly speaking, it's not incorrect: it's an "idempotent tombstone", so your downstream consumers should be able to handle it correctly. Of course, it's undesired and creates unnecessary downstream load.

    If the load is really a problem, your idea of suppressing these tombstones might work, but maintaining an additional store also sounds expensive. You would just be moving the overhead somewhere else (not sure you would gain much overall).
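
For reference, the "seen keys" idea from the question boils down to the decision logic below. This is only a minimal sketch in plain Java, not Kafka Streams code: `TombstoneFilter` and `shouldForward` are hypothetical names, and the in-memory `HashSet` is an assumption standing in for a state store. In a real topology this logic would presumably live in a Processor API processor backed by a persistent `KeyValueStore`, which is exactly the extra overhead mentioned above.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical helper (not part of Kafka Streams): decides whether a record
 * should be forwarded downstream. The in-memory set is an assumption that
 * stands in for a fault-tolerant state store.
 */
final class TombstoneFilter<K> {
    // Keys for which a non-null value was forwarded and not yet deleted.
    private final Set<K> liveKeys = new HashSet<>();

    /** Returns true if the record (key, value) should be forwarded. */
    boolean shouldForward(K key, Object value) {
        if (value != null) {
            // Regular update: remember the key and always forward it.
            liveKeys.add(key);
            return true;
        }
        // Tombstone: forward only if downstream has something to delete.
        // Set.remove returns true only when the key was present.
        return liveKeys.remove(key);
    }

    public static void main(String[] args) {
        TombstoneFilter<String> filter = new TombstoneFilter<>();
        System.out.println(filter.shouldForward("1", null));        // tombstone for an unseen key
        System.out.println(filter.shouldForward("1", "nice Jane")); // regular update
        System.out.println(filter.shouldForward("1", null));        // tombstone for a seen key
    }
}
```

Note that the set only shrinks when a real delete passes through, so every live key costs one store entry. That is the trade-off to weigh against the cost of the spurious tombstones.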