Tags: grafana, logstash-grok, prometheus

Grok exporter count doesn't decrease even when there are currently no errors


We have configured Grok exporter to monitor errors from the web service logs. We see that even when there are NO errors, it still reports the past count of errors.

We are using "gauge" as the metric type and polling the log file every 5 seconds.

Please see the config.yml below:

global:
  config_version: 2
input:
  type: file
  path: /ZAMBAS/logs/Healthcheck/AI/ai_17_grafana.log
  readall: true
  poll_interval_seconds: 5

grok:
  patterns_dir: ./patterns

metrics:
    - type: counter
      name: OutOfThreads
      help: Counter metric example with labels.
      match: '%{GREEDYDATA} WARN!! OUT OF THREADS: %{GREEDYDATA}'

    - type: counter
      name: OutOfMemory
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: Java heap space'

    - type: gauge
      name: NoMoreEndpointPrefix
      help: Counter metric example with labels.
      match: '%{GREEDYDATA}: APPL%{NUMBER:val1}: IO Exception: Connection refused %{GREEDYDATA}'
      value: '{{.val1}}'
      cumulative: false


    - type: gauge
      name: IOExceptionConnectionReset
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val3}: IO Exception: Connection reset'
      value: '{{.val3}}'
      cumulative: false


    - type: gauge
      name: IOExceptionReadTimedOut
      help: Counter metric example with labels.
      match: '   <faultstring>APPL%{NUMBER:val4}: IO Exception: Read timed out'
      value: '{{.val4}}'
      cumulative: false


    - type: gauge
      name: FailedToConnectTo
      help: Counter metric example with labels.
      match: "   <faultstring>RUNTIME0013: Failed to connect to '%{URI:val5}"
      value: '{{.val5}}'
      cumulative: false

server:
  port: 9244



Output:

grok_exporter_lines_matching_total{metric="FailedToConnectTo"} 0
grok_exporter_lines_matching_total{metric="IOExceptionConnectionReset"} 0
grok_exporter_lines_matching_total{metric="IOExceptionReadTimedOut"} 3
grok_exporter_lines_matching_total{metric="NoMoreEndpointPrefix"} 0
grok_exporter_lines_matching_total{metric="OutOfMemory"} 0
grok_exporter_lines_matching_total{metric="OutOfThreads"} 0

Say, for 1 hour there were no errors: it still shows '3' errors, and when an error does occur it keeps adding up. So the total becomes 4, and so on; it keeps on adding :(

I want grok to show only the present data without adding previous values.

Please help us understand what we are doing wrong.

Thanks, Priyotosh


Solution

  • This is the correct behaviour. Prometheus counters only ever go up (they reset only when the exporter restarts), and the `grok_exporter_lines_matching_total` series you are reading is itself a built-in counter of matched lines, so it will never drop back to zero when the errors stop. What you want is the rate() function in Prometheus, which calculates how many relevant log lines arrive per second over a sliding window, for example rate(OutOfThreads[5m]). That expression falls to 0 as soon as no new matching lines appear.
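
As a sketch, using the metric names from the config above, queries along these lines show current error activity rather than the lifetime total (`rate()` and `increase()` are standard PromQL functions; apply them to counter-typed series such as the built-in `grok_exporter_lines_matching_total`, not to the gauge metrics):

```promql
# Per-second rate of OutOfThreads matches over the last 5 minutes;
# evaluates to 0 once no new matching log lines arrive.
rate(OutOfThreads[5m])

# Number of new "Read timed out" lines seen in the last 5 minutes,
# using the exporter's built-in per-metric match counter.
increase(grok_exporter_lines_matching_total{metric="IOExceptionReadTimedOut"}[5m])
```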