I am using a Loki expression to monitor logs in Kubernetes pods. The expression is as follows:
count_over_time({job="namespace/files-lister"} |~ `\[ftp-[0-9]+\] Files listing is on process` [1h]) > 1
This expression counts the occurrences of logs containing [ftp-<number>] Files listing is on process within the last 1 hour for the files-lister pod.
For example, if the logs for the last 1 hour are:
[ftp-23392] Files listing is on process
[ftp-12423] Files listing is on process
[ftp-12345] Files listed
[ftp-53433] Files listing is on process
[ftp-23392] Files listing is on process
[ftp-23392] Files listing is on process
I want the expression to return or alert for ftp-23392, as it appears more than once in the last 1 hour. However, my current approach queries all logs with other IDs as well.
Is it possible to dynamically return or alert for the specific ID (ftp-23392 in this case) that appears more than once within the last 1 hour? Any insights would be greatly appreciated.
count_over_time aggregates over labels. So to count per file name, you need to extract that name from the log line into a label.
This can be done with the regexp parser:
count_over_time({job="namespace/files-lister"} | regexp `^\[(?P<file>ftp-[0-9]+)\] Files listing is on process` [1h]) > 1
This query returns the number of times each file appears in the logs over the last hour, and the name of the corresponding file is stored in the label file.
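A slightly more defensive variant is sketched below. It rests on a few assumptions that are not part of the original answer: that each log line starts with the bracketed ID, that a plain line filter on the message text is enough to drop unrelated lines such as "Files listed", and that you want exactly one series per ID regardless of any other stream labels. As far as I know, the regexp parser does not discard lines that fail to match, so the filters are there to keep such lines out of the count.

```logql
sum by (file) (
  count_over_time(
    {job="namespace/files-lister"}
      # keep only the lines of interest before parsing
      |= "Files listing is on process"
      # extract the ID (e.g. ftp-23392) into the label "file"
      | regexp `^\[(?P<file>ftp-[0-9]+)\] Files listing is on process`
      # drop any line where the extraction produced no ID
      | file != ""
      [1h]
  )
) > 1
```

With the sample logs from the question, this should leave a single series, file="ftp-23392" with a value of 3, since the other IDs each appear only once. Used as an alert rule expression, the file label would then be available to the alert.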