apache-kafka apache-kafka-streams

How to remove/clear state stores in Kafka Streams?


I have a custom Transformer implementation at the end of my kafka-streams DSL topology, with a persistent, changelog-backed KeyValueStore bound to it.

For the past few weeks I have been putting far too much data in the store. Now, whenever I start the application, it uses far too much RAM.

However, the application itself is just a prototype so I don't mind clearing the store entirely.

I could rename the kafka.application.id and the state-store-name, but that's only a temporary workaround (and the corresponding data and topics won't be deleted).

How do I purge it entirely?


Solution

  • Confluent's documentation recommends either calling KafkaStreams.cleanUp() or manually deleting the application's directory at /var/lib/kafka-streams/<application.id> (the base directory is set by the state.dir configuration parameter).

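    You also need to reset the topics used by the application with the dedicated reset tool, bin/kafka-streams-application-reset. It resets the committed offsets of the input topics and deletes the application's internal topics (changelog and repartition topics):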

    bin/kafka-streams-application-reset --application-id my-streams-app \
                                      --input-topics my-input-topic \
                                      --intermediate-topics rekeyed-topic
    

    This post about resetting the state is very interesting.
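
As a rough illustration of the manual-deletion option, the sketch below removes the local state directory of one application using only the JDK. The class name StateDirCleaner and the method deleteLocalState are hypothetical names chosen for this example, and the <state.dir>/<application.id> layout is taken from the answer above. It should only be run while the application is stopped. KafkaStreams.cleanUp() does the equivalent from inside the application and likewise may only be called while the instance is not running (before start() or after close()).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Kafka: wipes the local state of one Streams application.
public class StateDirCleaner {

    // Recursively deletes <stateDir>/<applicationId>.
    // Returns false if that directory did not exist, true once it has been removed.
    static boolean deleteLocalState(Path stateDir, String applicationId) throws IOException {
        Path appDir = stateDir.resolve(applicationId);
        if (!Files.exists(appDir)) {
            return false;
        }
        List<Path> toDelete;
        try (Stream<Path> paths = Files.walk(appDir)) {
            // Reverse order puts children before their parent directories.
            toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Both arguments are required so that nothing is deleted by accident.
        if (args.length != 2) {
            System.err.println("usage: StateDirCleaner <state.dir> <application.id>");
            return;
        }
        boolean removed = deleteLocalState(Paths.get(args[0]), args[1]);
        System.out.println(removed ? "local state removed" : "nothing to remove");
    }
}
```

This only clears the local copy of the store. The changelog topic on the brokers still holds the data, so without the reset tool step the store would be restored from it on the next start.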