
Restart Kafka Connect sink and source connectors to read from the beginning


I have searched quite a lot on this, but there doesn't seem to be a good guide on it.

From what I have found, there are a few things to consider:

  • Resetting the sink connector's internal topics (status, config and offset).
  • How source connector offsets are stored and interpreted is implementation-specific.

Question: Is there even a need to reset these topics?

  • Deleting the consumer group.
  • Restarting the connector under a different name (this is also an option), but it doesn't seem to be the right thing to do.
  • Resetting the consumer group offsets with --reset-offsets --to-earliest.
  • Using the REST API (does it provide a way to reset offsets and read from the beginning?).

What would be the best way to restart both a sink and a source connector so that they read from the beginning?


Solution

  • Source Connector:

    • Standalone mode: remove the offsets file (/tmp/connect.offsets, or whatever path offset.storage.file.filename points to) or change the connector name.
    • Distributed mode: change the name of the connector.

  • Sink Connector (both modes), one of the following methods:

    • Change the connector name.
    • Reset the offsets of the connector's consumer group. By default the group is named connect-<connector name>.

    To reset the offsets, first delete the connector, then reset the offsets (./bin/kafka-consumer-groups.sh --bootstrap-server :9092 --group connect-connectorName --reset-offsets --to-earliest --execute --topic topicName), and finally create the connector again with the same configuration.
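
    The delete, reset, re-create sequence for a sink connector could be sketched as a small shell helper. This is a minimal sketch under assumptions, not a definitive script: the REST URL (http://localhost:8083), KAFKA_BIN, the DRY_RUN switch and the run/reset_sink helper names are made up for illustration. It assumes the default sink consumer group id, connect-<connector name>; kafka-consumer-groups.sh --list shows the actual group ids if yours differs. By default it only prints the commands it would run.

```shell
# Sketch only: all names and defaults below are assumptions, adjust to your setup.
CONNECT_URL="${CONNECT_URL:-http://localhost:8083}"  # Connect REST endpoint (assumed)
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"             # Kafka bootstrap server (assumed)
KAFKA_BIN="${KAFKA_BIN:-./bin}"                      # directory with the Kafka scripts

# Print a command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# reset_sink <connector> <topic> <config.json>
# <config.json> is a hypothetical file holding {"name": ..., "config": {...}}.
# The body runs in a subshell, so its variables stay local.
reset_sink() (
  connector=$1
  topic=$2
  config_file=$3
  group="connect-${connector}"   # default group id of a sink connector

  # 1. Delete the connector so the consumer group becomes inactive.
  run curl -sf -X DELETE "${CONNECT_URL}/connectors/${connector}" &&
  # 2. Rewind the group's committed offsets to the earliest available.
  run "${KAFKA_BIN}/kafka-consumer-groups.sh" \
      --bootstrap-server "${BOOTSTRAP}" --group "${group}" --topic "${topic}" \
      --reset-offsets --to-earliest --execute &&
  # 3. Re-create the connector with the same configuration.
  run curl -sf -X POST -H "Content-Type: application/json" \
      --data "@${config_file}" "${CONNECT_URL}/connectors"
)
```

    After sourcing the file, reset_sink my-sink my-topic my-sink.json prints the three commands, and DRY_RUN=0 reset_sink my-sink my-topic my-sink.json runs them. Offsets can only be reset while the group has no active members, so the reset step may need a retry shortly after the connector is deleted.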

    See also the following question: Reset the JDBC Kafka Connector to start pulling rows from the beginning of time?
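
    For a standalone source connector, removing the offsets file could be sketched as follows. Again this is only a sketch: the helper names (run, offsets_file, reset_standalone_source), KAFKA_BIN, DRY_RUN and the properties file names are assumptions for illustration. The one Kafka-specific detail it relies on is the worker property offset.storage.file.filename, which sets where standalone mode stores source offsets. The worker has to be stopped before the file is removed.

```shell
# Sketch only: helper names and defaults are assumptions, adjust to your setup.
KAFKA_BIN="${KAFKA_BIN:-./bin}"   # directory with the Kafka scripts

# Print a command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Print the value of offset.storage.file.filename from a worker properties file.
offsets_file() {
  sed -n 's/^offset\.storage\.file\.filename[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# reset_standalone_source <worker.properties> <connector.properties>
# Assumes the standalone worker is already stopped.
reset_standalone_source() (
  worker_props=$1
  connector_props=$2
  file=$(offsets_file "$worker_props")
  if [ -z "$file" ]; then
    echo "offset.storage.file.filename not set in $worker_props" >&2
    exit 1
  fi
  # 1. Remove the stored source offsets.
  run rm -f "$file" &&
  # 2. Start the worker again; the connector begins without stored offsets.
  run "${KAFKA_BIN}/connect-standalone.sh" "$worker_props" "$connector_props"
)
```

    For example, reset_standalone_source config/connect-standalone.properties my-source.properties prints the rm and restart commands, and DRY_RUN=0 runs them. Whether the connector then really re-reads everything still depends on the connector, since source offsets are implementation-specific.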