Search code examples
javaapache-kafka-streams

What is the default WindowBytesStoreSupplier for stream-stream join in Kafka Streams?


The new API has a signature of:

join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, StreamJoined<K,V,VO> streamJoined)

If I only set store name using StreamJoined<K,V,VO> streamJoined parameter, what would be the default configuration of WindowBytesStoreSupplier in terms of retentionPeriod, windowSize and retainDuplicates? It's not clear from documentation. Is this an in memory state store? Or default one configured by Kafka Streams?


Solution

  • If you only set the store name with StreamJoined then:

    1. The windowSize comes from what you provide the JoinWindows configuration object. Even if you use a custom WindowBytesStoreSupplier, Kafka Streams validates that the supplier window settings match the settings of the provided JoinWindows object. So a JoinWindows.of(Duration.ofSeconds(30)) would have a windowSize equal to 30000 ms.
    2. The retentionPeriod is the window size + the grace period. The default grace period is 24 hours.
    3. The retainDuplicates configuration is true. But even when providing a custom StoreSupplier, the retainDuplicates field must be set to true.
    4. The store type is a persistent (RocksDB) store, configured by Kafka Streams, as described above.

    Also, note that with StreamJoined, you can now provide your own StoreSupplier for both sides of the join, so it's possible to have in-memory stores.

    HTH,

    Bill