
Grok-exporter service is active (running), but metrics don't show up [service error: Invalid configuration]


My main aim is to show data from a log file in a Prometheus server. For that, I'm using grok-exporter.

To do so, I set the path to my log file and changed the metric type and match pattern, as shown below:

global:
  config_version: 3
input:
  type: file
  path: /tmp/model.log
  readall: true # Read from the beginning of the file? False means we start at>
  fail_on_missing_logfile: true
imports:
- type: grok_patterns
  dir: /opt/grok-exporter/patterns
grok_patterns:
- 'METRICS [a-zA-Z ]'
metrics:
- type: gauge
  name: model_log
  help: Average duration of model
  match: '%{DATE:date} %{TIME:time} %{METRICS:metrics} %{NUMBER:avg_hour}'
  value: '{{.avg_hour}}'
  labels:
    metrics: '{{.metrics}}'
server:
    protocol: http
    port: 9144


And my log file looks like:

2021-11-09 15:18:17 avg_hour 0.9
2021-11-09 15:20:06 avg_hour 0.5
2021-11-09 15:20:06 avg_hour 0.4

When I started the grok-exporter server on port 9144, only the default metrics were exposed.
One of them is 'grok_exporter_line_processing_errors_total', which shows:

grok_exporter_line_processing_errors_total{metrics="model_log"} 0

which means there are zero errors.

But I couldn't see my metric 'model_log' on the server. Maybe my grok patterns are wrong, or grok couldn't access the log file in '/tmp/'. But if that were true, an error should have appeared.

UPDATE: When I run this on the command line:

journalctl -eu grok-exporter

there's an error:

... status =255/EXCEPTION
grok-exporter.service failed with results 'exit-code'
...
Failed to load ~/config.yml: invalid configuration: yaml: line 21: could not find expected ':'
...

But the output of:

systemctl status grok-exporter.service

shows the service as active (running). What might be the problem? I don't see anything wrong with ':' on line 21.


Solution

  • I think the problem lies in the patterns you're using to identify your metrics. The GitHub page (http://github.com/fstab/grok_exporter/blob/master/CONFIG.md) touches on this, but there isn't much else out there that explains the behaviour.

    Grok needs recognisable patterns. If the pattern you define for a column in your logs doesn't match the data being parsed, Grok won't act on that line. You can see this by opening http://localhost:9145/metrics (or whatever port you're hosting your target on) and looking for the line:

      grok_exporter_lines_matching_total{metric="log_events_total_count"} 0
    

    In that example I've called my metric 'log_events_total_count', and Grok has been unable to match any log line with the following config:

      grok_patterns:
      - 'METRIC [a-zA-Z ]'
      metrics:
        - type: gauge
          name: log_events_total_count
          help: Average duration of model
          match: '%{DATE:date} %{TIME:time} %{METRIC:event} %{NUMBER:num}'
          value: '{{.num}}'
          labels:
            Event: '{{.event}}'
    

    If you remove the '%{NUMBER:num}' and change your gauge to a counter, you should find that only 'a' appears for the Event label, because '[a-zA-Z ]' matches exactly one character. The result is similar to the following:

      CONFIG SNIPPET:
        grok_patterns:
        - 'METRIC [a-zA-Z ]'
        metrics:
          - type: counter
            name: log_events_total_count
            help: Average duration of model
            match: '%{DATE:date} %{TIME:time} %{METRIC:event}'
            labels:
              Event: '{{.event}}'
    
      METRICS RESULT:
        log_events_total_count{Event="a"} 3
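
    To see why only 'a' is captured, here is a minimal sketch using Python's re module as a stand-in for Grok's regex engine. The `prefix` expression is a simplified, assumed replacement for '%{DATE} %{TIME}', not the real Grok definitions; only the behaviour of the character class is the point here.

```python
import re

line = "2021-11-09 15:18:17 avg_hour 0.9"

# Assumption: simplified stand-in for '%{DATE:date} %{TIME:time} '.
prefix = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} "

# '[a-zA-Z ]' is a character class without a quantifier: one character only.
single_char = re.search(prefix + r"(?P<event>[a-zA-Z ])", line)
print(single_char.group("event"))  # a

# With the number appended, one letter must be followed by a space,
# but "a" is followed by "v", so the whole line fails to match.
with_number = re.search(
    prefix + r"(?P<event>[a-zA-Z ]) (?P<num>\d+(?:\.\d+)?)", line
)
print(with_number)  # None
```

    The second search mirrors the original gauge config: the line never matches, so the metric never shows up and no processing error is counted either.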
    

    There are two things to consider for your logs in particular:

    1. What regular expression identifies the pattern for the event column
    2. What regular expression best represents the value in the final column

    I'd highly recommend the regexr website, which lets you paste in some sample text and try out different regular expressions: https://regexr.com/

    To address your METRIC pattern, change it to include \w+ so that a whole "word" is matched:

       grok_patterns:
          - 'METRIC ([a-zA-Z])\w+'
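
    As a quick check of what that expression accepts, here is a small sketch, again assuming Python's re module behaves like Grok's engine for this simple case. The sample words other than 'avg_hour' are made up for illustration.

```python
import re

# The regex part of the suggested METRIC pattern: one letter, then 1+ word chars.
METRIC = r"([a-zA-Z])\w+"

# Hypothetical sample values; only "avg_hour" comes from the log file above.
words = ["avg_hour", "p95_latency", "a", "9lives"]
results = {word: re.fullmatch(METRIC, word) is not None for word in words}

for word, matched in results.items():
    print(f"{word}: {matched}")
```

    Note that a single-letter name such as "a" is rejected, because \w+ needs at least one more character after the leading letter.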
    

    For your numerical value, since it's a floating-point number, you can use the NUMBER grok pattern and convert it to a float:

      %{NUMBER:num:float}
    

    With that in place, the gauge should present you with something similar to this:

      log_events_total_count{Event="avg_hour"} 0.4
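
    The expected result can be reproduced outside grok_exporter with a rough simulation. This sketch assumes simplified regexes in place of the real DATE, TIME and NUMBER patterns, and assumes the gauge keeps the last matched value per label.

```python
import re

LOG = """\
2021-11-09 15:18:17 avg_hour 0.9
2021-11-09 15:20:06 avg_hour 0.5
2021-11-09 15:20:06 avg_hour 0.4
"""

# Assumption: simplified stand-ins for the grok DATE, TIME and NUMBER patterns.
LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<event>[a-zA-Z]\w+) "
    r"(?P<num>\d+(?:\.\d+)?)$"
)

gauge = {}  # label value -> last observed number
for line in LOG.splitlines():
    m = LINE.match(line)
    if m is None:
        continue  # unmatched lines are skipped
    gauge[m["event"]] = float(m["num"])

for event, value in gauge.items():
    print(f'log_events_total_count{{Event="{event}"}} {value}')
```

    Each matching line overwrites the previous value for the same label, which is why the last log entry (0.4) is the one reported.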
    

    Hopefully this helps!