grafana-loki grafana-alerts

Using Loki Expression to Alert for Repeated Log Occurrences in Grafana


I am using a Loki expression to monitor logs in Kubernetes pods. The expression is as follows:

count_over_time({job="namespace/files-lister"} |~ `\[ftp-[0-9]+\] Files listing is on process` [1h]) > 1

This expression counts the occurrences of logs containing [ftp-<number>] Files listing is on process within the last 1 hour for the files-lister pod.

For example, if the logs for the last 1 hour are:

[ftp-23392] Files listing is on process 
[ftp-12423] Files listing is on process 
[ftp-12345] Files listed 
[ftp-53433] Files listing is on process 
[ftp-23392] Files listing is on process 
[ftp-23392] Files listing is on process

I want the expression to return or alert for ftp-23392, as it appears more than once in the last 1 hour. However, my current query counts the matching lines for all IDs together, so it does not tell me which ID was repeated.

Is it possible to dynamically return or alert for the specific ID (ftp-23392 in this case) that appears more than once within the last 1 hour? Any insights would be greatly appreciated.


Solution

  • count_over_time returns one series per unique combination of labels. So to count occurrences per ID (the ftp-<number> part of the line), you need to extract that ID from the log line into a label.

    This can be done with the regexp parser:

    count_over_time({job="namespace/files-lister"} | regexp `^\[(?P<file>ftp-[0-9]+)\] Files listing is on process` [1h]) > 1
    

    This query returns, for each ID, the number of times it appeared in the logs over the last hour, with the ID stored in the label file. Because of the > 1 comparison, only IDs that appeared more than once are returned.
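A possible variant of the query above, as a sketch rather than a definitive rule. It makes two assumptions that are not in the original answer: that the stream carries other labels (for example a pod label) that you want collapsed so each ID yields exactly one series, and that you want to keep a line filter in front of the parser so only the "Files listing is on process" lines are counted. The label name file is taken from the answer; the |= filter and the sum by (file) wrapper are additions.

```logql
sum by (file) (
  count_over_time(
    {job="namespace/files-lister"}
      |= "Files listing is on process"
      | regexp `^\[(?P<file>ftp-[0-9]+)\] Files listing is on process`
    [1h]
  )
) > 1
```

Against the six sample lines in the question, this should return a single series, file="ftp-23392", with value 3, since ftp-12423 and ftp-53433 each appear only once and the "Files listed" line is filtered out. Grafana alert rules are evaluated per returned series, so the file label should be available on each alert instance. If your log lines carry a prefix such as a timestamp before the bracket, the ^ anchor would need to be dropped or adjusted.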