Suppose I have a CSV file that looks like this:
cummulated_values
0
2
5
10
17
How can I use Logstash filters to add a new "values" column whose rows are defined as values[n] := cummulated_values[n] - cummulated_values[n-1] for 0 < n < total number of rows, and values[0] := cummulated_values[0], where cummulated_values[n] means the n-th row of the "cummulated_values" column? The output table should then look like this:
cummulated_values, values
0, 0
2, 2
5, 3
10, 5
17, 7
I would implement that using a ruby filter.
csv { autodetect_column_names => true }
ruby {
    code => '
        c = event.get("cummulated_values").to_i
        @previous ||= 0                     # no previous row before the first event, so start from 0
        event.set("values", c - @previous)  # values[n] = cummulated_values[n] - cummulated_values[n-1]
        @previous = c                       # remember the current value for the next event
    '
}
Note that the previous value has to start at 0, not at the first event's value; otherwise values[0] would always be 0 instead of cummulated_values[0].
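If you want to try this end to end, a minimal test pipeline could look like the sketch below. The stdin input, the stdout output with a line codec, and the file name pipeline.conf are my assumptions, not something the question prescribes:
# pipeline.conf -- hypothetical test harness; the input and output blocks are assumptions
input {
    stdin {}
}
filter {
    csv { autodetect_column_names => true }
    ruby {
        code => '
            c = event.get("cummulated_values").to_i
            @previous ||= 0
            event.set("values", c - @previous)
            @previous = c
        '
    }
}
output {
    stdout { codec => line { format => "%{cummulated_values}, %{values}" } }
}
Start it with bin/logstash -f pipeline.conf -w 1 and paste the CSV (including the header line) into stdin.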
You need the order of events to be preserved, and you need all events to go through the same instance of the ruby filter. Therefore you must set pipeline.workers to 1, and if pipeline.ordered is set, verify that it is set to either auto or true (the default value is auto, so if you have not set it you will be OK).
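As a sketch, in logstash.yml (or per pipeline in pipelines.yml) that combination of settings would look like this:
pipeline.workers: 1     # a single worker sends every event through the same ruby filter instance
pipeline.ordered: auto  # the default; "true" also works, "false" would break the ordering
The worker count can also be passed on the command line as -w 1 (or --pipeline.workers 1).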