
Duplicate rows in a Logstash pipeline (Elasticsearch input, SQL database output)


I am using an Elasticsearch index as the input in my Logstash config, and the output is the logstash jdbc output plugin, which writes the logs to the columns of a SQL database table. The problem is that I get duplicate rows in the SQL database. I tried the Logstash uuid filter plugin, but nothing changed. What is the reason, and what solution do you suggest? Here is my config:

input{
    elasticsearch {
        hosts => "ip:9200"
        index => "indexname"
        user => "user"
        password => "elastic"
        query => '{ "query": { "query_string": { "query": "*" } } }'
        schedule => "*/5 * * * *"   #Specifies how often the query should be executed. In this case, it's set to run every 5 minutes
        size => 1500   #Specifies the maximum number of hits returned per scroll batch (not a cap on the total number of documents)
        scroll => "5m" #Specifies how long Elasticsearch should keep the search context open for the query. In this case, it's set to 5 minutes
        docinfo => true
      }
}
filter {
     uuid {
        target    => "document_id"
        overwrite => true
      }
   }
    
output {
  if "API_REQUEST" in [message] {
    jdbc {
      driver_jar_path => '/usr/share/logstash/vendor/jar/jdbc/mssql-jdbc-12.2.0.jre8.jar'
      connection_string => "jdbc:sqlserver://ip:1433;databaseName=izdb;user=user;password=pass;ssl=false;trustServerCertificate=true"
      enable_event_as_json_keyword => true
      statement => [
        "INSERT INTO Transaction (document_id, logLevel, timestamp) VALUES (?,?,?)",
        "document_id",
        "logLevel",
        "timestamp"
      ]
    }
  }
}

Solution

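  • Here are a few ways to detect and solve the problem. The likely cause is that the uuid filter generates a new random value every time an event passes through it, while the scheduled query ("*", every 5 minutes) reads the same documents again on each run. Every run therefore inserts the same logs again under new ids.

    1. Add docinfo_target => "[@metadata][doc]" to the elasticsearch input and restart Logstash. Together with docinfo => true, this stores each document's metadata (including its _id) under [@metadata][doc].

    2. Change the uuid filter step so that document_id is taken from %{[@metadata][doc][_id]} rather than from a generated UUID. The same Elasticsearch document then always produces the same document_id.

    3. Add stdout {} to the Logstash output and inspect the printed events to find the root cause.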

    https://www.elastic.co/guide/en/logstash/current/plugins-inputs-elasticsearch.html#plugins-inputs-elasticsearch-docinfo
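
    Putting the three suggestions together, the pipeline could look roughly like the sketch below. Treat it as a starting point under a few assumptions of mine, not a tested config: I swapped the uuid filter for a mutate filter (the uuid filter only generates new ids, it cannot copy an existing value), I rewrote the INSERT so it skips rows whose document_id is already in the table, and I put the table name in brackets because TRANSACTION is a reserved word in T-SQL. Hosts, credentials, and column names are copied from the question as-is.

```
input {
  elasticsearch {
    hosts          => "ip:9200"
    index          => "indexname"
    user           => "user"
    password       => "elastic"
    query          => '{ "query": { "query_string": { "query": "*" } } }'
    schedule       => "*/5 * * * *"
    size           => 1500
    scroll         => "5m"
    docinfo        => true
    # Suggestion 1: keep the document metadata (_index, _id) under [@metadata][doc]
    docinfo_target => "[@metadata][doc]"
  }
}

filter {
  # Suggestion 2 (mutate used instead of uuid, my assumption):
  # reuse the Elasticsearch _id so the id is stable across scheduled runs
  mutate {
    replace => { "document_id" => "%{[@metadata][doc][_id]}" }
  }
}

output {
  # Suggestion 3: print every event, including @metadata, while debugging
  stdout { codec => rubydebug { metadata => true } }

  if "API_REQUEST" in [message] {
    jdbc {
      driver_jar_path => '/usr/share/logstash/vendor/jar/jdbc/mssql-jdbc-12.2.0.jre8.jar'
      connection_string => "jdbc:sqlserver://ip:1433;databaseName=izdb;user=user;password=pass;ssl=false;trustServerCertificate=true"
      enable_event_as_json_keyword => true
      # Assumption: insert only when this document_id is not in the table yet.
      # document_id is listed twice because the statement has four placeholders.
      statement => [
        "INSERT INTO [Transaction] (document_id, logLevel, timestamp) SELECT ?, ?, ? WHERE NOT EXISTS (SELECT 1 FROM [Transaction] WHERE document_id = ?)",
        "document_id",
        "logLevel",
        "timestamp",
        "document_id"
      ]
    }
  }
}
```

    Note that a stable document_id does not stop a plain INSERT from adding the same row again on the next run, which is why the statement checks for an existing row first. A primary key or unique constraint on document_id would be another way to enforce this on the database side, if you can change the table.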