we have a simple command id deduplication mechanism based on Kafka Streams State Stores. It stores command ids from the past hour in persistentWindowStore. We encountered a problem with the store if there's an exception thrown later in that topology. We run 3 nodes using docker, each with multiple threads set for this particular Streams Application. The flow is as follows:
2020-04-16 07:26:19.144 DEBUG 1 --- [-StreamThread-1] c.g.f.c.s.validation.CommandIdValidator : CommandId: mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a is not a duplicate.
2020-04-16 07:26:19.144 INFO 1 --- [-StreamThread-1] c.g.f.c.s.v.CommandIdValidationUtils : Putting commandId in store: mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a 1587013200000
2020-04-16 07:26:19.153 INFO 1 --- [-StreamThread-1] g.f.l.s.v.XXXXValidationHandler :
Exception in thread "XXXXXXProcessor-53816dfa-6ebf-47ad-864c-0fbcb61dc51c-StreamThread-1" java.lang.RuntimeException: Unexpected exponent value: -5
Failed to process stream task 1_2 due to the following error
Shutting down
Unclean shutdown of all active tasks
Flushing all stores registered in the state manager
First node in the topology analyses if this is a duplicate, if not puts it in the state store. Next topology node throws an exception. As a result, transaction is aborted, offsets are not committed. I double checked for the changelog topic - relevant messages are not committed.
2020-04-16 07:26:43.399 DEBUG 1 --- [-StreamThread-2] c.g.f.c.s.validation.CommandIdValidator : CommandId: mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a is not a duplicate.
2020-04-16 07:26:43.400 INFO 1 --- [-StreamThread-2] c.g.f.c.s.v.CommandIdValidationUtils : Putting commandId in store: mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a 1587013200000
2020-04-16 07:26:43.425 INFO 1 --- [-StreamThread-2] g.f.l.s.v.XXXXValidationHandler :
Exception in thread "XXXXXXProcessor-d069f877-90ff-4ecf-bfbf-3b2bd0e20eac-StreamThread-2" java.lang.RuntimeException: XXXX
Again same story, transaction aborted.
2020-04-16 07:26:55.655 INFO 1 --- [-StreamThread-2] c.g.f.c.s.v.CommandIdValidationUtils : next: KeyValue(1587013200000, mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a)
2020-04-16 07:26:55.655 WARN 1 --- [-StreamThread-2] c.g.f.c.s.validation.CommandIdValidator : Command duplicate detected. Command id mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a
We analysed task restoration logs and all seems normal, aborted transactions were skipped:
Skipping aborted record batch from partition
However, the rocks db for this task contains the command id:
docker cp 134d4f6f82cf:/tmp/kafka-streams/XXXXXXProcessor/1_0/COMMAND_ID_STORE/COMMAND_ID_STORE.1587013200000 .
ldb --db=COMMAND_ID_STORE.1587013200000 scan
mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a : mnl_cmd_1cd3a7c9-56ef-4c08-a312-d9b69681726a
mnl_cmd_c51cbb71-0274-44c6-b016-976add743fe6 : mnl_cmd_c51cbb71-0274-44c6-b016-976add743fe6
Our understanding is that even though the transaction was aborted on node 1 (A), the rocks db entry wasn't cleaned afterwards, and during (C) another thread fetched data from already existing rocks db (after all, the disk path is connected with the task, not thread). We are not sure what happens with persisted store during task restoration or during a task failure. According to us, this situation breaks exactly-once semantics, as the state store is not in sync with changelog topic.
Is it a wrong configuration at our side/wrong assumptions? A bug? We tried changing the store to inMemoryWindowStore and it seems to solve the problem. But we'd like to understand this behaviour as we heavily rely on state stores.
We use Java 11, kafka clients 4.1 and spring-kafka 2.4.5. We rolled back to kafka clients 2.3.1 and the problem persists.
EDIT we performed some more tests with configuration changes and after changing num.standby.replicas = 1
to num.standby.replicas = 0
the problem disappeared. From logs it looks like that standby state is created properly, i.e. aborted messages are not taken into account, but the state on disk gets corrupted nevertheless, which is quite easy to reproduce.
One more analysis:
- Node A Thread-2 handles the message, crashes
- Node B Thread-2 is the standy replica, takes over, crashes
- Node B Thread-1 takes over, detects a duplicate
EDIT 2 detailed logs relevant to checkpoint files. At first, a valid command is sent, then an invalid one (hence, a changelog topic is not empty).
NODE_3 2020-04-15 21:06:14.470 TRACE 1 --- [-StreamThread-2] o.a.k.s.p.i.ProcessorStateManager : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-2] task [1_2] Checkpointable offsets read from checkpoint: {}
NODE_3 2020-04-15 21:06:19.413 TRACE 1 --- [-StreamThread-2] o.a.k.s.p.i.ProcessorStateManager : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-2] task [1_2] Restoring state store COMMAND_ID_STORE from changelog topic XXXXProcessor-COMMAND_ID_STORE-changelog at checkpoint null
NODE_3 2020-04-15 21:06:28.470 TRACE 1 --- [-StreamThread-1] o.a.k.s.p.i.ProcessorStateManager : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-1] standby-task [1_2] Checkpointable offsets read from checkpoint: {}
NODE_3 2020-04-15 21:06:29.634 TRACE 1 --- [-StreamThread-2] o.a.k.s.s.internals.OffsetCheckpoint : Writing tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp
NODE_3 2020-04-15 21:06:29.640 TRACE 1 --- [-StreamThread-2] o.a.k.s.s.internals.OffsetCheckpoint : Swapping tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint
NODE_3 2020-04-15 21:11:15.909 TRACE 1 --- [-StreamThread-1] o.a.k.s.s.internals.OffsetCheckpoint : Writing tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp
NODE_3 2020-04-15 21:11:15.912 TRACE 1 --- [-StreamThread-1] o.a.k.s.s.internals.OffsetCheckpoint : Swapping tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint
NODE_1 log1:2020-04-15 21:11:33.942 DEBUG 1 --- [-StreamThread-2] c.g.f.c.s.validation.CommandIdValidator : CommandId: mnl_cmd_9f1752da-45b7-4ef7-9ef8-209d826530bc is not a duplicate.
NODE_3 2020-04-15 21:11:47.195 TRACE 1 --- [-StreamThread-1] o.a.k.s.s.internals.OffsetCheckpoint : Writing tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp
NODE_3 2020-04-15 21:11:47.233 TRACE 1 --- [-StreamThread-1] o.a.k.s.s.internals.OffsetCheckpoint : Swapping tmp checkpoint file /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint.tmp /tmp/kafka-streams/XXXXProcessor/1_2/.checkpoint
NODE_3 2020-04-15 21:11:49.075 TRACE 1 --- [-StreamThread-2] o.a.k.s.p.i.ProcessorStateManager : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-2] task [1_2] Restoring state store COMMAND_ID_STORE from changelog topic XXXXProcessor-COMMAND_ID_STORE-changelog at checkpoint 1
NODE_3 2020-04-15 21:11:49.436 TRACE 1 --- [-StreamThread-2] o.a.k.s.p.i.StoreChangelogReader : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-2] Found checkpoint 1 from changelog XXXXProcessor-COMMAND_ID_STORE-changelog-2 for store COMMAND_ID_STORE.
NODE_3 2020-04-15 21:11:52.023 DEBUG 1 --- [-StreamThread-2] c.g.f.c.s.validation.CommandIdValidator : CommandId: mnl_cmd_9f1752da-45b7-4ef7-9ef8-209d826530bc is not a duplicate.
NODE_3 2020-04-15 21:11:53.683 ERROR 1 --- [-StreamThread-2] o.a.k.s.p.i.AssignedStreamsTasks : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-2] Failed to process stream task 1_2 due to the following error: java.lang.RuntimeException
NODE_3 2020-04-15 21:12:05.346 TRACE 1 --- [-StreamThread-1] o.a.k.s.p.i.ProcessorStateManager : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-1] task [1_2] Restoring state store COMMAND_ID_STORE from changelog topic XXXXProcessor-COMMAND_ID_STORE-changelog at checkpoint 1
NODE_3 2020-04-15 21:12:05.562 TRACE 1 --- [-StreamThread-1] o.a.k.s.p.i.StoreChangelogReader : stream-thread [XXXXProcessor-94f7be8e-beec-411f-b4ec-9031527bccdf-StreamThread-1] Found checkpoint 1 from changelog XXXXProcessor-COMMAND_ID_STORE-changelog-2 for store COMMAND_ID_STORE.
NODE_3 2020-04-15 21:12:06.424 WARN 1 --- [-StreamThread-1] c.g.f.c.s.validation.CommandIdValidator : Command duplicate detected. Command id mnl_cmd_9f1752da-45b7-4ef7-9ef8-209d826530bc
Thanks for you help.
Code operating on the COMMAND_ID_STORE (similar to an example from Kafka Streams)
Adding state store
WindowBytesStoreSupplier storeSupplier = Stores.persistentWindowStore(storeName,
RETENTION,
WINDOW_SIZE,
true);
StoreBuilder<WindowStore<String, String>> storeBuilder = Stores
.windowStoreBuilder(storeSupplier, Serdes.String(), Serdes.String());
streamsBuilder.addStateStore(storeBuilder);
Fetching
public static boolean isCommandIdDuplicate(String commandId, WindowStore<String, String> commandIdStore) {
long previousHourStart = DateTimeUtils.previousHourStart();
final WindowStoreIterator<String> storeIterator = commandIdStore
.fetch(commandId, previousHourStart, DateTimeUtils.currentTimeMillis());
final boolean hasNext = storeIterator.hasNext();
log.info("hasNext: {}", hasNext);
while (storeIterator.hasNext()) {
log.info("next: {}", storeIterator.next());
}
storeIterator.close();
return hasNext;
}
Putting
final long windowStartTimestamp = DateTimeUtils.thisHourStart();
log.info("Putting commandId in store: {} {}", commandId, windowStartTimestamp);
commandIdStore.put(commandId, commandId, windowStartTimestamp);
Transform
.transformValues(ValueTransformerSupplier<YYY,
? extends XXX>) CommandIdValidator::new,
COMMAND_ID_STORE);
Transformer
@Slf4j
public class CommandIdValidator<T extends GeneratedMessageV3> implements
ValueTransformer<T, XXX<T>> {
WindowStore<String, String> commandIdStore;
@Override
public void init(ProcessorContext context) {
commandIdStore = (WindowStore<String, String>) context.getStateStore(COMMAND_ID_STORE);
}
@Override
public XXX<T> transform(T command) {
log.debug("Validating commandId.");
String commandId = getCommandId(command);
if (CommandIdValidationUtils.isCommandIdDuplicate(commandId, commandIdStore)) {
log.warn("Command duplicate detected. Command id " + commandId);
return new XXX(command,
XXX.duplicatedCommandId(commandId, command.getClass().getName()));
} else {
log.debug("CommandId: {} is not a duplicate.", commandId);
CommandIdValidationUtils.putCommandIdInStore(commandId, commandIdStore);
return XXX.successful(command);
}
}
@Override
public void close() {
}
}
This has turned out to be a bug and has been fixed in the subsequent versions: KAFKA-9891