logstash, logstash-grok

How do you parse text in grok?


I need to capture two variables from this path using grok:

/opt/data/app_log/server101.log

server = everything after the last forward slash and before the dot (in this case, server101)
index = the text between the last two forward slashes (in this case, app_log)

Any ideas how I could do this in grok? Here is my current filter:

    grok {
        patterns_dir => ["/pattern"]
        match => { "path" => "%{WORD:dir1}\/%{WORD:dir2}\/%{WORD:index_name}\/%{WORD:server}\.%{WORD:file_type}" }
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp},%{NUMBER:Num_field} %{WORD:error_level} %{GREEDYDATA:origin}, %{WORD:logger} - %{GREEDYDATA:message}" }
    }

Solution

  • The easiest solution is:

    /%{DATA:col1}/%{DATA:col2}/%{DATA:index}/%{DATA:server}\.%{GREEDYDATA:end}
    

    You can remove the names col1, col2, and end to drop those captures.

    This pattern relies on the path always having the same number of parts. If the number of parts varies, you could use something like this:

    (?:/%{USER})*/%{DATA:index}/%{DATA:server}\.%{GREEDYDATA:end}
    

    I built and tested these patterns using the Grok Constructor.


    The variable-depth pattern can be used in a filter like this:

    filter {
      grok {
        match => { 
          "message" => <message-pattern>
        }
      }
      grok {
        match => { 
          "log_path" => "(?:/%{USER})*/%{DATA:index}/%{DATA:server}\.%{GREEDYDATA}"
        }
      }
    }
    

    Here, "log_path" is the name of the field containing the log path after you do your normal message parsing.
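
To see what the two path patterns capture without running Logstash, they can be approximated with ordinary regular expressions. The sketch below is an illustration only: it assumes the stock grok definitions DATA = `.*?`, GREEDYDATA = `.*`, and USER = `[a-zA-Z0-9._-]+`, and it uses Python's `re` module rather than the Oniguruma engine that grok runs on, so treat it as a rough check and not as a replacement for testing in Logstash. The helper names `grok_to_regex` and `parse_path` are made up for this example.

```python
import re

# Assumed regex equivalents of the grok patterns used in the answer.
GROK_TO_REGEX = {
    "DATA": r".*?",
    "GREEDYDATA": r".*",
    "USER": r"[a-zA-Z0-9._-]+",
}

# The two path patterns from the answer, copied verbatim.
FIXED = r"/%{DATA:col1}/%{DATA:col2}/%{DATA:index}/%{DATA:server}\.%{GREEDYDATA:end}"
VARIABLE = r"(?:/%{USER})*/%{DATA:index}/%{DATA:server}\.%{GREEDYDATA:end}"


def grok_to_regex(pattern):
    """Expand %{NAME} and %{NAME:field} into plain regex syntax (hypothetical helper)."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        body = GROK_TO_REGEX[name]
        # Named grok captures become named groups; unnamed ones are not captured.
        return f"(?P<{field}>{body})" if field else f"(?:{body})"
    return re.sub(r"%\{(\w+)(?::(\w+))?\}", expand, pattern)


def parse_path(path, pattern=VARIABLE):
    """Return the captured fields as a dict, or None if the pattern does not match."""
    match = re.search(grok_to_regex(pattern), path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_path("/opt/data/app_log/server101.log", FIXED))
    print(parse_path("/opt/data/app_log/server101.log", VARIABLE))
    # One extra directory level: only the variable-depth pattern still works.
    print(parse_path("/var/opt/data/app_log/server101.log", FIXED))
    print(parse_path("/var/opt/data/app_log/server101.log", VARIABLE))
```

For /opt/data/app_log/server101.log, both patterns give index = app_log and server = server101. With one extra leading directory, the fixed-depth pattern assigns the wrong segment to index (data) and lets server swallow a slash, while the variable-depth pattern still returns app_log and server101.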