Is it possible to mention partition id in the logstash output config?


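  logstash.conf: |
    input {
      kafka {
        group_id => "test-group"
        bootstrap_servers => "cluster_one"
        topics => ["topic-one"]
        codec => "json"
        type => "test"
        # Exposes Kafka metadata (partition, key, ...) under [@metadata][kafka]
        decorate_events => true
      }
    }

    filter {
      mutate {
        # Copy the source partition number into an event field
        add_field => { "[partition_number]" => "%{[@metadata][kafka][partition]}" }
      }
    }

    output {
      kafka {
        # topic_id takes a single topic name (a string), not an array
        topic_id => "topic-one"
        bootstrap_servers => "cluster_two"
        codec => json
      }
    }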

This is my Logstash config. Is there a way to specify the partition_number in the output? My requirement is that events from a given partition ID on Kafka cluster_one go to the same partition ID on Kafka cluster_two.


Solution

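  • First, Logstash cannot guarantee that your destination topic has the same number of partitions as the source topic, so you have to ensure that yourself.

    Otherwise, if all of the following hold:

    • Your records have keys, and those keys reach the output unchanged
    • Logstash's Kafka output uses the Kafka DefaultPartitioner Java class
    • The original producer doesn't have any custom partitioning logic

    then the partition is computed from the key in the same way on both clusters, so each record arrives at the same partition number, and you don't need any filter/mutate logic or deserialization.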
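
    A minimal sketch of an output section that relies on key-based partitioning, in case it helps. It assumes the input keeps decorate_events enabled so that the source record's key is available as [@metadata][kafka][key], and it uses the output plugin's message_key and partitioner options. As far as I know the output plugin has no option for setting a partition number directly, which is why this goes through the key. Treat it as an untested starting point, not a verified config.

```
output {
  kafka {
    bootstrap_servers => "cluster_two"
    topic_id => "topic-one"
    codec => json
    # Assumption: forward the source record's key, populated by
    # decorate_events on the kafka input
    message_key => "%{[@metadata][kafka][key]}"
    # "default" asks for key hashing; this should already be the
    # behaviour when a message key is present
    partitioner => "default"
  }
}
```

    Two caveats: records without a key leave the %{...} reference unresolved, so they would all share the same literal key and land in one partition, and keys that are not plain strings may not survive the round trip through Logstash byte for byte.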