Tags: apache-kafka, spring-cloud, spring-kafka

How to handle UnknownProducerIdException


We are having some trouble with Spring Cloud and Kafka: at times our microservice throws an UnknownProducerIdException. This happens when the broker-side transactional.id.expiration.ms timeout has elapsed and the producer's metadata has been removed.

My question: is it possible to catch that exception and retry the failed message? If so, what would be the best way to handle it?

I have taken a look at:
- https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820
- Kafka UNKNOWN_PRODUCER_ID exception

We are using Spring Cloud Hoxton.RELEASE and Spring Kafka 2.2.4.RELEASE.

We are using AWS's managed Kafka service, so we can't set a new value for the broker property mentioned above.

Here is some trace of the exception:

2020-04-07 20:54:00.563 ERROR 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager   : [Producer clientId=producer-2] The broker returned org.apache.kafka.common.errors.UnknownProducerIdException: This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception. for topic-partition test.produce.another-2 with producerId 35000, epoch 0, and sequence number 8
2020-04-07 20:54:00.563  INFO 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager   : [Producer clientId=producer-2] ProducerId set to -1 with epoch -1
2020-04-07 20:54:00.565 ERROR 5188 --- [ad | producer-2] o.s.k.support.LoggingProducerListener    : Exception thrown when sending a message with key='null' and payload='{...}' to topic <some-topic>:

To reproduce this exception:
- I used the Confluent Docker images and set the environment variable KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS to 10 seconds so I wouldn't have to wait too long for the exception to be thrown.
- In another process, I sent one message every 10 seconds to the topic the Java application listens on.

Here is a code example:

File Bindings.java

import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface Bindings {
  @Input("test-input")
  SubscribableChannel testListener();

  @Output("test-output")
  MessageChannel testProducer();
}

File application.yml (don't forget to set the environment variable KAFKA_HOST):

spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-create-topics: true
          brokers: ${KAFKA_HOST}
          transaction:
            producer:
              error-channel-enabled: true
          producer-properties:
            acks: all
            retry.backoff.ms: 200
            linger.ms: 100
            max.in.flight.requests.per.connection: 1
            enable.idempotence: true
            retries: 3
            compression.type: snappy
            request.timeout.ms: 5000
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
          consumer-properties:
            session.timeout.ms: 20000
            max.poll.interval.ms: 350000
            enable.auto.commit: true
            allow.auto.create.topics: true
            auto.commit.interval.ms: 12000
            max.poll.records: 5
            isolation.level: read_committed
          configuration:
            auto.offset.reset: latest

      bindings:

        test-input:
          # contentType: text/plain
          destination: test.produce
          group: group-input
          consumer:
            maxAttempts: 3
            startOffset: latest
            autoCommitOnError: true
            queueBufferingMaxMessages: 100000
            autoCommitOffset: true


        test-output:
          # contentType: text/plain
          destination: test.produce.another
          group: group-output
          producer:
            acks: all

debug: true

The listener handler:

import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.UnknownProducerIdException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.binding.BinderAwareChannelResolver;
import org.springframework.messaging.Message;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.support.MessageBuilder;

@SpringBootApplication
@EnableBinding(Bindings.class)
public class PocApplication {

    private static final Logger log = LoggerFactory.getLogger(PocApplication.class);

    public static void main(String[] args) {
        SpringApplication.run(PocApplication.class, args);
    }


    @Autowired
    private BinderAwareChannelResolver binderAwareChannelResolver;

    @StreamListener("test-input")
    public void listen(Message<?> in) {
        final MessageBuilder builder;
        MessageChannel messageChannel;

        messageChannel = this.binderAwareChannelResolver.resolveDestination("test-output");

        Object payload = in.getPayload();
        builder = MessageBuilder.withPayload(payload);

        try {
            log.info("Event received: {}", in);

            if (!messageChannel.send(builder.build())) {
                log.error("Something happened trying to send the message! {}", in.getPayload());
            }

            log.info("Commit success");
        } catch (UnknownProducerIdException e) {
            log.error("UnknownProducerIdException caught ", e);
        } catch (KafkaException e) {
            log.error("KafkaException caught ", e);
        } catch (Exception e) {
            System.out.println("Commit failed " + e.getMessage());
        }
    }
}

Regards


Solution

  •         } catch (UnknownProducerIdException e) {
                log.error("UnknownProducerIdException caught ", e);
    

    To catch exceptions there, you need to set the sync Kafka producer property (https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.3.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#kafka-producer-properties). Otherwise, the error comes back asynchronously, after the send() call has already returned.
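    For example, using the binding name from the question, the property would go under the Kafka-binder-specific producer section of application.yml (a sketch of where the documented `sync` property lives, not the full configuration):

    ```yaml
    spring:
      cloud:
        stream:
          kafka:
            bindings:
              test-output:
                producer:
                  # Block send() until the broker acknowledges or fails,
                  # so send errors surface as exceptions in the calling thread.
                  sync: true
    ```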

    You should not "eat" the exception there; it must be thrown back to the container so the container will roll back the transaction.
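    To see why swallowing the exception matters, here is a minimal plain-Java sketch (a hypothetical stand-in for the container's delivery loop, not Spring's actual code): the "container" commits only when the listener returns normally, so a listener that eats the exception causes a commit of a record that was never successfully forwarded.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    public class RollbackSketch {
        public static List<String> committed = new ArrayList<>();

        // Simulates the container's transactional delivery: commit on normal
        // return, roll back (leave uncommitted, eligible for retry) on throw.
        public static boolean deliver(Runnable listener, String record) {
            try {
                listener.run();
                committed.add(record);   // commit only on normal return
                return true;
            } catch (RuntimeException e) {
                return false;            // rollback: record stays uncommitted
            }
        }

        public static void main(String[] args) {
            // Listener that swallows the failure: the container commits anyway.
            boolean swallowed = deliver(() -> {
                try { throw new RuntimeException("UnknownProducerIdException"); }
                catch (RuntimeException e) { /* eaten, container never sees it */ }
            }, "rec-1");

            // Listener that lets it propagate: the container rolls back.
            boolean propagated = deliver(() -> {
                throw new RuntimeException("UnknownProducerIdException");
            }, "rec-2");

            // prints: true false [rec-1]
            System.out.println(swallowed + " " + propagated + " " + committed);
        }
    }
    ```

    The same principle applies in the real listener: remove the catch blocks (or rethrow) so the transaction is rolled back and the delivery retried.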

    Also,

            }catch (Exception e) {
                System.out.println("Commit failed " + e.getMessage());
            }
    

    The commit is performed by the container after the stream listener returns to the container so you will never see a commit error here; again, you must let the exception propagate back to the container.

    The container will retry the delivery according to the consumer binding's retry configuration.
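    That retry configuration lives on the consumer binding; a sketch using the binding name from the question (maxAttempts is already set to 3 there, the backOff* values below are illustrative):

    ```yaml
    spring:
      cloud:
        stream:
          bindings:
            test-input:
              consumer:
                maxAttempts: 3               # total delivery attempts, including the first
                backOffInitialInterval: 1000 # ms before the first retry
                backOffMultiplier: 2.0
                backOffMaxInterval: 10000
    ```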