Tags: grafana, grafana-loki, logql, grafana-dashboard

Calculating Unique System Counts per Error Message with Loki Query in Grafana Dashboard


I am currently working on a Grafana dashboard where I visualize various errors logged in different systems and count the number of times these errors occur. I am using Loki to send journal logs to Grafana. I've managed to create a query that groups error messages and counts their occurrences over a span of 7 days.

Here is my current Loki query:

topk(10, sum by(message)(count_over_time({job="systemd-journal"} |~ `ERROR:` | regexp `(?P<message>ERROR:.*)` [7d])))

The output from this query is as follows: [image: output]

Now, I would like to extend this functionality to include an additional value which indicates from how many different systems these errors are coming. Ideally, this would be displayed next to the current value in the dashboard.

I tried adding a "Group by" transformation in Grafana, grouping by the message field and counting unique system identifiers (like hostname) associated with each error message. Here was my attempted query modification:

sum by(message, hostname)(count_over_time({job="systemd-journal"} |~ `ERROR:` | regexp `(?P<message>ERROR:.*)` [7d]))

I expected to see an additional column indicating the count of unique systems per error message. However, this approach doesn't seem to work, as I end up with no data when adding the "Group by" transformation in Grafana.

My output for the query above without the "Group by" transformation: [image: output2]


Solution

  • Solved it with markalex's suggestion: wrap the attempted query in an outer count, i.e. count by(hostname) ( <your attempt sum by(message, hostname) ...> )
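
Written out in full, the suggestion wraps the earlier sum by(message, hostname) attempt in an outer count. This is a sketch rather than a verified dashboard query; it assumes hostname is available as a label on the journal streams, as the attempted query already does:

```logql
# Inner sum: one series per (message, hostname) pair.
# Outer count: number of those series per hostname.
count by(hostname) (
  sum by(message, hostname) (
    count_over_time({job="systemd-journal"} |~ `ERROR:` | regexp `(?P<message>ERROR:.*)` [7d])
  )
)
```

In LogQL, as in PromQL, the label in the outer by clause decides what each result row is keyed on. Grouped by hostname as above, each row is a host and the value is the number of distinct messages seen on it. If the goal is one row per error message with the number of distinct systems reporting it, a variant that groups the outer count by message may be what is needed:

```logql
# Outer count grouped by message: number of distinct hostnames per message.
count by(message) (
  sum by(message, hostname) (
    count_over_time({job="systemd-journal"} |~ `ERROR:` | regexp `(?P<message>ERROR:.*)` [7d])
  )
)
```

To show this value next to the occurrence count, one option (an assumption, not something confirmed in the thread) is to add it as a second query in the same table panel and combine the two results on the message field with Grafana's "Join by field" transformation.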