We are having some trouble with Spring Cloud and Kafka: sometimes our microservice throws an UnknownProducerIdException. This happens when the transactional.id.expiration.ms period has expired on the broker side.
My question: is it possible to catch that exception and retry the failed message? If so, what is the best way to handle it?
I have taken a look at:
- https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820
- Kafka UNKNOWN_PRODUCER_ID exception
We are using Spring Cloud Hoxton.RELEASE and Spring Kafka 2.2.4.RELEASE.
We are using the AWS Kafka solution, so we can't set a new value for the property mentioned above.
Here is some trace of the exception:
2020-04-07 20:54:00.563 ERROR 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] The broker returned org.apache.kafka.common.errors.UnknownProducerIdException: This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception. for topic-partition test.produce.another-2 with producerId 35000, epoch 0, and sequence number 8
2020-04-07 20:54:00.563 INFO 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] ProducerId set to -1 with epoch -1
2020-04-07 20:54:00.565 ERROR 5188 --- [ad | producer-2] o.s.k.support.LoggingProducerListener : Exception thrown when sending a message with key='null' and payload='{...}' to topic <some-topic>:
To reproduce this exception:
- I used the Confluent Docker images and set the environment variable KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS to 10 seconds so I wouldn't have to wait long for the exception to be thrown.
- In another process, send messages one at a time, at 10-second intervals, to the topic the Java application listens to.
Here is a code example:
File Bindings.java
    import org.springframework.cloud.stream.annotation.Input;
    import org.springframework.cloud.stream.annotation.Output;
    import org.springframework.messaging.MessageChannel;
    import org.springframework.messaging.SubscribableChannel;

    public interface Bindings {

        @Input("test-input")
        SubscribableChannel testListener();

        @Output("test-output")
        MessageChannel testProducer();
    }
File application.yml (don't forget to set the environment variable KAFKA_HOST):
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-create-topics: true
          brokers: ${KAFKA_HOST}
          transaction:
            producer:
              error-channel-enabled: true
          producer-properties:
            acks: all
            retry.backoff.ms: 200
            linger.ms: 100
            max.in.flight.requests.per.connection: 1
            enable.idempotence: true
            retries: 3
            compression.type: snappy
            request.timeout.ms: 5000
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
          consumer-properties:
            session.timeout.ms: 20000
            max.poll.interval.ms: 350000
            enable.auto.commit: true
            allow.auto.create.topics: true
            auto.commit.interval.ms: 12000
            max.poll.records: 5
            isolation.level: read_committed
          configuration:
            auto.offset.reset: latest
      bindings:
        test-input:
          # contentType: text/plain
          destination: test.produce
          group: group-input
          consumer:
            maxAttempts: 3
            startOffset: latest
            autoCommitOnError: true
            queueBufferingMaxMessages: 100000
            autoCommitOffset: true
        test-output:
          # contentType: text/plain
          destination: test.produce.another
          group: group-output
          producer:
            acks: all
debug: true
The listener handler:
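    import org.apache.kafka.common.KafkaException;
    import org.apache.kafka.common.errors.UnknownProducerIdException;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.cloud.stream.binding.BinderAwareChannelResolver;
    import org.springframework.messaging.Message;
    import org.springframework.messaging.MessageChannel;
    import org.springframework.messaging.support.MessageBuilder;

    @SpringBootApplication
    @EnableBinding(Bindings.class)
    public class PocApplication {

        private static final Logger log = LoggerFactory.getLogger(PocApplication.class);

        public static void main(String[] args) {
            SpringApplication.run(PocApplication.class, args);
        }

        @Autowired
        private BinderAwareChannelResolver binderAwareChannelResolver;

        @StreamListener(Topics.TESTLISTENINPUT)
        public void listen(Message<?> in, String headerKey) {
            final MessageBuilder builder;
            MessageChannel messageChannel;
            // Resolve the output channel and forward the incoming payload unchanged.
            messageChannel = this.binderAwareChannelResolver.resolveDestination("test-output");
            Object payload = in.getPayload();
            builder = MessageBuilder.withPayload(payload);
            try {
                log.info("Event received: {}", in);
                if (!messageChannel.send(builder.build())) {
                    log.error("Something happend trying send the message! {}", in.getPayload());
                }
                log.info("Commit success");
            } catch (UnknownProducerIdException e) {
                log.error("UnkownProducerIdException catched ", e);
            } catch (KafkaException e) {
                log.error("KafkaException catched ", e);
            } catch (Exception e) {
                System.out.println("Commit failed " + e.getMessage());
            }
        }
    }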
Regards
    } catch (UnknownProducerIdException e) {
        log.error("UnkownProducerIdException catched ", e);

To catch exceptions there, you need to set the sync Kafka producer property (https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.3.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#kafka-producer-properties). Otherwise, the error comes back asynchronously.
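To illustrate why the try/catch only helps with a synchronous send, here is a minimal plain-Java sketch that uses neither Kafka nor Spring. The method sendAsync is a hypothetical stand-in for a producer send whose result arrives later, and the simulated error is an assumption for the demo, not the binder's actual behavior.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class SyncVsAsyncSend {

    // Hypothetical stand-in for a producer send: the failure is reported
    // through the returned future, not thrown to the caller.
    static CompletableFuture<Void> sendAsync(String payload) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        result.completeExceptionally(
                new IllegalStateException("simulated broker error for " + payload));
        return result;
    }

    // Fire-and-forget send: returns true only if the catch block saw the failure.
    static boolean caughtWhenAsync(String payload) {
        try {
            sendAsync(payload); // the failure stays inside the future
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    // Blocking send: waiting on the future surfaces the failure in the caller.
    static boolean caughtWhenSync(String payload) {
        try {
            sendAsync(payload).get();
            return false;
        } catch (ExecutionException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("async caught: " + caughtWhenAsync("msg"));
        System.out.println("sync caught: " + caughtWhenSync("msg"));
    }
}
```

In the asynchronous case the catch block never runs, which mirrors what happens in the listener when sync is not set.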
You should not "eat" the exception there; it must be thrown back to the container so the container will roll back the transaction.
Also,

    } catch (Exception e) {
        System.out.println("Commit failed " + e.getMessage());
    }

The commit is performed by the container after the stream listener returns to the container so you will never see a commit error here; again, you must let the exception propagate back to the container.
The container will retry the delivery according to the consumer binding's retry configuration.
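As a rough illustration of the difference between propagating and swallowing the exception, here is a plain-Java sketch with no Spring or Kafka involved. The deliver method is a hypothetical stand-in for the container's redelivery loop, and the limit of 3 attempts is taken from the maxAttempts value in the question's configuration. The real container also rolls back the transaction, which this sketch does not model.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ContainerRetrySketch {

    // Hypothetical stand-in for the container: redelivers the message until
    // the listener returns normally or the attempts are exhausted.
    // Returns the attempt number that succeeded, or -1 if none did.
    static int deliver(String message, Consumer<String> listener, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                listener.accept(message);
                return attempt;
            } catch (RuntimeException e) {
                // A real container would roll back here; the loop then redelivers.
            }
        }
        return -1;
    }

    // Listener whose send fails the first `failures` times and lets the
    // exception propagate to the caller.
    static Consumer<String> propagating(AtomicInteger sendCalls, int failures) {
        return message -> {
            if (sendCalls.incrementAndGet() <= failures) {
                throw new IllegalStateException("simulated send failure");
            }
        };
    }

    // Same listener, but it "eats" the exception as in the question's code.
    static Consumer<String> swallowing(AtomicInteger sendCalls, int failures) {
        Consumer<String> inner = propagating(sendCalls, failures);
        return message -> {
            try {
                inner.accept(message);
            } catch (RuntimeException e) {
                // Swallowed: the container sees a normal return and never retries.
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger propagated = new AtomicInteger();
        int okAttempt = deliver("msg", propagating(propagated, 2), 3);
        System.out.println("propagating: succeeded on attempt " + okAttempt
                + " after " + propagated.get() + " send calls");

        AtomicInteger swallowed = new AtomicInteger();
        int falseOk = deliver("msg", swallowing(swallowed, 2), 3);
        System.out.println("swallowing: container saw success on attempt " + falseOk
                + " after " + swallowed.get() + " send call(s)");
    }
}
```

With the swallowing listener the container sees success on the first attempt even though the send failed, so the message is never retried.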