I have log files that I pass through Logstash to be modified before they are pushed to Elasticsearch.
One of the fields sometimes appears as a series of digits:
foobar = 42
Sometimes it is prefixed with letters:
foobar = ws-42
I want the field to always be an integer, with any non-digit characters removed.
Here is the part of the Logstash config that makes sure the field is an integer:
filter {
  mutate {
    convert => [ "foobar", "integer" ]
  }
}
How can I strip out the characters if present?
Update
With the mutate filter I can either strip out non-numeric characters or convert the field to an integer. However, if I try to do both in the same mutate block, the field comes back as 0.
Example
input {
  stdin {}
}
filter {
  kv { }
  mutate {
    gsub => [ "foobar", "\D", "" ]
    convert => [ "foobar", "integer" ]
  }
}
Here is the output. If '42' is provided, foobar comes back as the integer 42. If 'sw-42' is provided, foobar comes back as 0.
foobar="42"
{
    "message" => "foobar=\"42\"",
    "@version" => "1",
    "@timestamp" => "2015-03-31T22:32:11.718Z",
    "host" => "swat-logstash02",
    "foobar" => 42
}
foobar="sw-42"
{
    "message" => "foobar=\"sw-42\"",
    "@version" => "1",
    "@timestamp" => "2015-03-31T22:32:23.822Z",
    "host" => "swat-logstash02",
    "foobar" => 0
}
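It's an order-of-operations issue. Within a single mutate block, the operations run in a fixed internal order regardless of how they are written in the config, and convert runs before gsub. So "sw-42" is converted to the integer 0 first, which leaves gsub no string to clean up.
If you do just the gsub (without the convert), you can see that the regexp itself is working: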
{
    "message" => "foobar=\"sw-42\"",
    "@version" => "1",
    "@timestamp" => "2015-03-31T22:42:40.097Z",
    "host" => "0.0.0.0",
    "foobar" => "42"
}
So you should run the gsub and the convert as two separate mutate stanzas, which are applied in the order they appear in the config:
filter {
  kv { }
  mutate {
    gsub => [ "foobar", "\D", "" ]
  }
  mutate {
    convert => [ "foobar", "integer" ]
  }
}
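To check this end to end, a minimal sketch of a complete pipeline is below. It reuses the stdin input and kv filter from the question and adds a stdout output with the rubydebug codec. That output section is an assumption on my part, since the question does not show which output was used, although the printed events look like rubydebug.

```
input {
  stdin {}
}
filter {
  # Parse key=value pairs such as foobar="sw-42" into fields
  kv { }
  # First stanza: strip every non-digit character from the string
  mutate {
    gsub => [ "foobar", "\D", "" ]
  }
  # Second stanza: only now convert the cleaned string to an integer
  mutate {
    convert => [ "foobar", "integer" ]
  }
}
output {
  # Assumed output, used here only to print each event for inspection
  stdout { codec => rubydebug }
}
```

Typing foobar="sw-42" on stdin should then print an event in which foobar is the integer 42 rather than 0.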