elasticsearch, logstash, elastic-stack, elk

Inconsistent streaming behavior with Logstash - ELK


I have an index with several flat fields and several nested fields. I am trying to stream data from SQL Server through Logstash into a nested field, keyed by a specific id. When I stream the data for only one id, all of it arrives without any problem. When I stream the data for more than one id, however, only part of the data ends up in the index. Note: the query is sorted by id. Moreover, the amount of data that arrives differs from run to run. For example, suppose the full data set contains 15 rows: in one run only 2 rows arrive, but in another run 14 rows arrive, seemingly arbitrarily. Does anyone have any idea what could cause this strange behavior? I would be grateful for any help. Thanks!


Solution

  • This is because of the Logstash execution model: several pipeline workers run in parallel, so events that belong to the same id may be processed by different worker threads.

    If you want consistent loading behavior, you need to run your pipeline with a single worker (-w 1 on the command line).
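
    As a minimal sketch, the same single-worker setting can also be placed in the settings file instead of being passed on every run. This assumes you are free to edit logstash.yml for the instance that runs this pipeline; the comments are illustrative only.

    ```yaml
    # logstash.yml
    # Equivalent to passing `-w 1` (long form: `--pipeline.workers 1`) on the command line.
    # With a single worker, all events go through one thread,
    # so events for the same id are no longer split across threads.
    pipeline.workers: 1
    ```

    The trade-off is throughput: a single worker processes events serially, so loading will likely be slower than with the default worker count.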