multithreading apache-kafka-streams

Is the org.apache.kafka.streams.KafkaStreams#store method thread safe?


In my Kafka Streams application, I have the following two threads:

  • Thread A: creates a Topology object, including its state stores, then calls the KafkaStreams constructor and start() on the resulting object.
  • Thread B: holds a reference to the KafkaStreams object that thread A created. It periodically calls KafkaStreams#store on that object, obtains a ReadOnlyWindowStore instance, and reads the data in the store for monitoring purposes.

I'm wondering whether what my app does is OK in terms of thread safety. I'm not too worried about ReadOnlyWindowStore, because its javadoc says:

Implementations should be thread-safe as concurrent reads and writes are expected.

But as for KafkaStreams#store, I'm not sure it is safe to call from a separate thread. One thing that concerns me is that it touches a HashMap, which is not thread-safe, and a KafkaStreams#removeStreamThread() call can mutate that same HashMap object. Given that, I'm not sure the method is designed to be thread-safe.
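To make the concern concrete, here is a minimal JDK-only sketch. It is not Kafka Streams code: ProviderRegistry, its methods, and the "thread-N" / "provider-N" strings are hypothetical stand-ins for "a class that keeps per-thread entries in a plain HashMap". It only shows one way such a map could be shared safely between a mutating thread and a reading thread, by guarding every access with the same lock. Without that guard, the HashMap javadoc makes no guarantees once one thread structurally modifies the map while another reads it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RegistryDemo {

    // Hypothetical stand-in for a class that tracks one entry per stream thread.
    // This is NOT a Kafka Streams class.
    static final class ProviderRegistry {
        private final Map<String, String> providers = new HashMap<>();
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Mutation (think "a thread was added") takes the write lock.
        void add(String threadName, String provider) {
            lock.writeLock().lock();
            try {
                providers.put(threadName, provider);
            } finally {
                lock.writeLock().unlock();
            }
        }

        // Mutation (think "a thread was removed") takes the write lock.
        void remove(String threadName) {
            lock.writeLock().lock();
            try {
                providers.remove(threadName);
            } finally {
                lock.writeLock().unlock();
            }
        }

        // Readers copy the values under the read lock, so they never
        // iterate the HashMap while a writer is modifying it.
        List<String> snapshot() {
            lock.readLock().lock();
            try {
                return new ArrayList<>(providers.values());
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    // Runs one mutating thread and one reading thread against the same registry
    // and returns the number of entries left at the end.
    static int run() {
        ProviderRegistry registry = new ProviderRegistry();

        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                registry.add("thread-" + i, "provider-" + i);
                if (i % 2 == 0) {
                    registry.remove("thread-" + i); // drop every even entry again
                }
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                registry.snapshot(); // concurrent read, like a monitoring thread
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return registry.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 5000: only the odd-numbered entries remain
    }
}
```

If the read path skipped the lock and iterated the map directly, the same program could fail intermittently (for example with a ConcurrentModificationException) or read inconsistent state, which is the kind of behavior I'm worried about for KafkaStreams#store.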

My questions: is it OK to call KafkaStreams#store from a thread other than the one that instantiated the KafkaStreams object? Or would it be better to call store() in that same thread and share only the ReadOnlyWindowStore instance with other threads? Is the KafkaStreams class designed to be thread-safe at all?


Solution

  • It turned out that KafkaStreams#store is in fact not thread-safe as of Kafka Streams 3.6.1. More details can be found in https://issues.apache.org/jira/browse/KAFKA-16055