Search code examples
apache-kafkaapache-kafka-streamsstatefulrocksdb

Adding data to state store for stateful processing and fault tolerance


I have a microservice that perform some stateful processing. The application construct a KStream from an input topic, do some stateful processing then write data into the output topic.

I will be running 3 of this applications in the same group. There are 3 parameters that I need to store in the event when the microservice goes down, the microservice that takes over can query the shared statestore and continue where the crashed service left off.

I am thinking of pushing these 3 parameters into a statestore and query the data when the other microservice takes over. From my research, I have seen a lot of example when people perform event counting using state store but that's not exactly what I want, does anyone know an example or what is the right approach for this problem?


Solution

  • So you want to do 2 things:

    a. the service going down have to store the parameters:
    If you want to do it in a straightforward way than all you have to do is to write a message in the topic associated with the state store (the one you are reading with a KTable). Use the Kafka Producer API or a KStream (could be kTable.toStream()) to do it and that's it.

    Otherwise you could create manually a state store:

    // take these serde as just an example
    Serde<String> keySerde = Serdes.String();
    Serde<String> valueSerde = Serdes.String();
    KeyValueBytesStoreSupplier storeSupplier = inMemoryKeyValueStore(stateStoreName);
    streamsBuilder.addStateStore(Stores.keyValueStoreBuilder(storeSupplier, keySerde, valueSerde));
    

    then use it in a transformer or processor to add items to it; you'll have to declare this in the transformer/processor:

    // depending on the serde above you might have something else then String
    private KeyValueStore<String, String> stateStore;
    

    and initialize the stateStore variable:

    @Override
    public void init(ProcessorContext context) {
      stateStore = (KeyValueStore<String, String>) context.getStateStore(stateStoreName);
    }
    

    and later use the stateStore variable:

    @Override
    public KeyValue<String, String> transform(String key, String value) {
      // using stateStore among other actions you might take here
      stateStore.put(key, processedValue);
    }
    

    b. read the parameters in the service taking over:
    You could do it with a Kafka consumer but with Kafka Streams you first have to make the store available; the easiest way to do it is by creating a KTable; then you have to get the queryable store name that is automatically created with the KTable; then you have to actually get access to the store; then you extract a record value from the store (i.e. a parameter value by its key).

    // this example is a modified copy of KTable javadocs example
    final StreamsBuilder streamsBuilder = new StreamsBuilder();
    
    // Creating a KTable over the topic containing your parameters a store shall automatically be created.
    //
    // The serde for your MyParametersClassType could be 
    // new org.springframework.kafka.support.serializer.JsonSerde(MyParametersClassType.class) 
    // though further configurations might be necessary here - e.g. setting the trusted packages for the ObjectMapper behind JsonSerde.
    //
    // If the parameter-value class is a String then you could use Serdes.String() instead of a MyParametersClassType serde.
    final KTable paramsTable = streamsBuilder.table("parametersTopicName", Consumed.with(Serdes.String(), <<your InstanceOfMyParametersClassType serde>>));
    
    ...
    // see the example from KafkaStreams javadocs for more KafkaStreams related details
    final KafkaStreams streams = ...;
    streams.start()
    ...
    
    // get the queryable store name that is automatically created with the KTable
    final String queryableStoreName = paramsTable.queryableStoreName();
    // get access to the store
    ReadOnlyKeyValueStore view = streams.store(queryableStoreName, QueryableStoreTypes.timestampedKeyValueStore());
    // extract a record value from the store
    InstanceOfMyParametersClassType parameter = view.get(key);