
How to extract CPU Usage details from the log file in logstash


I am trying to extract the CPU usage and timestamp from the message:

2015-04-27T11:54:45.036Z| vmx| HIST ide1 IRQ 4414 42902 [  250 -   375 ) count:    2 (0.00%) min/avg/max: 250/278.50/307

I am using logstash and here is my logstash.config file:

input {
    file {
        path => "/home/xyz/Downloads/vmware.log"
        start_position => beginning
    }
}

filter {
    grok {
        match => ["message", "%{@timestamp}"]
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

But it's giving me a grok parse error. Any help would be really appreciated. Thanks.


Solution

  • As per the message from Magnus, you're using the grok match function incorrectly. @timestamp is the name of a system field that Logstash uses as the timestamp the message was received at, not the name of a grok pattern.

    First, I recommend you have a look at some of the default grok patterns you can use, which can be found here. I also recommend you use the grok debugger. Finally, if all else fails, get yourself into the #logstash IRC channel (on freenode). We're pretty active in there, so I'm sure someone will help you out.

    Just to help you out a bit further, here is a quick grok pattern I have created which should match your example. I only used the grok debugger to test it, so results in production might not be perfect. Test it!

    filter {
      grok {
        match => [ "message", "%{TIMESTAMP_ISO8601}\|\ %{WORD}\|\ %{GREEDYDATA}\ min/avg/max:\ %{NUMBER:minimum}/%{NUMBER:average}/%{NUMBER:maximum}" ]
      }
    }
    

    To explain slightly, %{TIMESTAMP_ISO8601} is a default grok pattern which matches the timestamp in your example.

    You will notice the use of \ quite a lot. Grok patterns are regular expressions, so a character such as the pipe has a special meaning (alternation), and escaping it with \ makes the regex engine treat it literally. Escaping the spaces as well is harmless, though a plain space normally matches itself without it.

    I have used the %{GREEDYDATA} pattern because it will capture anything. This is useful when you just want to capture the rest of the message: if you put it at the end of the grok pattern, it will capture all remaining text. Here I have followed it with a literal bit from your example (min/avg/max:) to stop GREEDYDATA from swallowing the rest of the message, as we want the data after that.

    %{NUMBER} will capture numbers, obviously. The bit after the : inside the curly braces defines the name of the field that Logstash will store the value in, and that field is what subsequently gets saved in Elasticsearch.
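
    If you also want the timestamp as a field, as the question asks, a minimal sketch along the same lines follows. The field names log_timestamp and component are placeholders I made up, so rename them as you like. The :float suffix asks grok to store the numbers as floats rather than strings, and the date filter then parses log_timestamp into @timestamp. I have not run this against your file, so treat it as a starting point and check it in the grok debugger first.

    filter {
      grok {
        # log_timestamp and component are hypothetical field names, rename as needed
        match => [ "message", "%{TIMESTAMP_ISO8601:log_timestamp}\|\ %{WORD:component}\|\ %{GREEDYDATA}\ min/avg/max:\ %{NUMBER:minimum:float}/%{NUMBER:average:float}/%{NUMBER:maximum:float}" ]
      }
      date {
        # use the time from the log line as the event's @timestamp
        match => [ "log_timestamp", "ISO8601" ]
      }
    }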

    I hope that helps!