Tags: mongodb, apache-kafka, apache-kafka-streams, filebeat

Kafka Data Stream ID


I am new to Kafka and am trying to build a pipeline that moves my Apache httpd logs into MongoDB.

The data is produced by Filebeat with the Kafka output. I then use Kafka Streams to read from the topic, transform the data with mapValues, and stream it out to a different topic. From there, Kafka Connect is supposed to sink the data into a database (MongoDB). Unfortunately, my data from Filebeat does not come with an ID.

How can I generate a unique ID for each record and insert it into the document before sinking it to MongoDB? I am hoping this can happen in the mapValues transformation.


Solution

  • I think you could use the combination of partition and offset as a unique ID per message, since that pair is unique within a topic. Add the topic name as well if the ID needs to be unique across topics.
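A minimal sketch of that idea in plain Java, assuming the record value has already been parsed into a Map. The names RecordIds, recordId and withId are hypothetical, and so is the choice of "_id" as the target field (MongoDB's primary-key field). Whether the sink connector actually uses that field as the document key depends on how the connector is configured. Note that mapValues itself does not expose the partition or offset. In Kafka Streams this metadata is available through the processor context, for example inside transformValues or processValues, so the helper would need to be called from there rather than from a plain mapValues lambda.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a per-record ID from Kafka record coordinates.
class RecordIds {

    // (topic, partition, offset) identifies exactly one record in a cluster.
    // Partition and offset are numeric, so the last two "-"-separated parts
    // are unambiguous even when the topic name itself contains "-".
    static String recordId(String topic, int partition, long offset) {
        return topic + "-" + partition + "-" + offset;
    }

    // Returns a copy of the document with the generated ID stored under "_id"
    // (assumed field name). The input map is left unmodified.
    static Map<String, Object> withId(Map<String, Object> doc,
                                      String topic, int partition, long offset) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        out.put("_id", recordId(topic, partition, offset));
        return out;
    }
}
```

Because the ID is derived from the record's position in the log, reprocessing the same record produces the same ID, which helps avoid duplicate documents if the pipeline replays data.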