I am using the elapsed filter plugin to calculate the time difference between multiple start/end events for a particular id.
if [StepName] == "Step1" and [StepStatus] == "start" {
mutate { add_tag => "Step1_start" }
} else if [StepName] == "Step1" and [StepStatus] == "end" {
mutate { add_tag => "Step1_end" }
} else if [StepName] == "Step2" and [StepStatus] == "start" {
mutate { add_tag => "Step2_start" }
} else if [StepName] == "Step2" and [StepStatus] == "end" {
mutate { add_tag => "Step2_end" }
} else if [StepName] == "Step3" and [StepStatus] == "start" {
mutate { add_tag => "Step3_start" }
} else if [StepName] == "Step3" and [StepStatus] == "end" {
mutate { add_tag => "Step3_end" }
}
elapsed{
start_tag => "Step1_start"
end_tag => "Step1_end"
unique_id_field => "FrtId"
new_event_on_match => false
timeout => 1800
}
elapsed{
start_tag => "Step2_start"
end_tag => "Step2_end"
unique_id_field => "FudtId"
new_event_on_match => false
timeout => 1800
}
elapsed{
start_tag => "Step3_start"
end_tag => "Step3_end"
unique_id_field => "FudtId"
new_event_on_match => false
timeout => 1800
}
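For reference, here is a hypothetical pair of events (the field values are invented purely for illustration) that the first elapsed filter is supposed to pair up by FrtId:

  { "FrtId": "12345", "StepName": "Step1", "StepStatus": "start" }
  { "FrtId": "12345", "StepName": "Step1", "StepStatus": "end" }

When the start and end are paired, the plugin records the elapsed time on the end event; when an end event arrives and no matching start has been seen for that id, the end event is tagged "elapsed_end_without_start", which is exactly the problem described below.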
The problem I am facing is that, despite the data being absolutely correct, many documents end up with the "elapsed_end_without_start" tag, even though the corresponding start event for the same id appears earlier in the file I am loading from.
Any help will be appreciated. Thanks for A2A.
I've finally got my answer.
In case anyone faces the same problem in the future: it's an issue with the elapsed filter plugin when Logstash runs with more than one pipeline worker.
The documentation should state that the elapsed filter plugin only works correctly with a single Logstash worker, as it already does for the aggregate filter plugin.
With multiple workers, the start event for an id may be handled by one worker and the end event by another; in that case the plugin tags the end event with "elapsed_end_without_start".
That said, the plugin does not fail completely with multiple workers. In my case it still matched roughly 70-80% of the pairs, but which events get matched is essentially random.
The workaround is to set the number of Logstash pipeline workers to one. It is not an optimal solution, though, since data ingestion becomes slower and all the load falls on a single worker.
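For completeness, pinning the pipeline to a single worker is standard Logstash configuration rather than anything specific to this plugin. It can be set in logstash.yml (or per pipeline in pipelines.yml):

  pipeline.workers: 1

or on the command line when starting Logstash:

  bin/logstash -f your_pipeline.conf -w 1

Here your_pipeline.conf is just a placeholder for whatever pipeline configuration file you are actually using.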