elasticsearch logstash amazon-elasticsearch

Logstash: Missing data after migration


I have been migrating one of the indexes in our self-hosted Elasticsearch to amazon-elasticsearch using Logstash. We have around 1812 documents in our self-hosted Elasticsearch, but in amazon-elasticsearch we have only about 637. More than half of the documents are missing after migration.

Our logstash config file

input {
  elasticsearch {
    hosts => ["https://staing-example.com:443"]
    user => "userName"
    password => "password"
    index => "testingindex"
    size => 100
    scroll => "1m"
  }
}

filter {
}

output {
  amazon_es {
    hosts => ["https://example.us-east-1.es.amazonaws.com:443"]
    region => "us-east-1"
    aws_access_key_id => "access_key_id"
    aws_secret_access_key => "access_key_id"
    index => "testingindex"
  }
  stdout {
    codec => rubydebug
  }
}

We have tried some of the other indexes as well, but it still migrates only half of the documents.


Solution

  • Make sure you compare apples to apples by running GET index/_count against your index on both sides.

    You might see more or fewer documents depending on where you look (Elasticsearch HEAD plugin, Kibana, Cerebro, etc.) and on whether replicas are included in the count.

    In your case, you had more replicas in your local environment than in your AWS Elasticsearch service, hence the different count.
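
If you want to script the comparison, here is a minimal Python sketch that calls the _count endpoint on both clusters and reports the difference. The host names, index name, and credentials are the placeholders from the question. The helper names (count_url, fetch_count, describe_difference, main) are made up for illustration. It assumes both endpoints accept plain or basic-auth HTTP requests; an AWS domain that enforces IAM-based access would need signed requests, which this sketch does not do.

```python
import base64
import json
import urllib.request


def count_url(base_url, index):
    """Build the _count endpoint URL for one index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def fetch_count(base_url, index, user=None, password=None):
    """Return the document count Elasticsearch reports for the index."""
    request = urllib.request.Request(count_url(base_url, index))
    if user is not None:
        # Basic auth header; assumption: the cluster accepts basic auth.
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["count"]


def describe_difference(source_count, target_count):
    """Summarize how the source and target counts compare."""
    if source_count == target_count:
        return f"counts match ({source_count})"
    difference = source_count - target_count
    return f"source={source_count}, target={target_count}, difference={difference}"


def main():
    # Placeholder hosts and credentials taken from the question's config.
    source = fetch_count("https://staing-example.com:443", "testingindex",
                         "userName", "password")
    target = fetch_count("https://example.us-east-1.es.amazonaws.com:443",
                         "testingindex")
    print(describe_difference(source, target))
```

Call main() once the placeholders are replaced with real endpoints. If the two _count values agree, the numbers you saw earlier most likely came from a tool that counts documents differently, not from documents lost in the migration.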