
How to determine which services did not send logs in the last 24h in Grafana Loki (LogQL)


My company has a lot of customers, each with one or more services. Each of these services sends logs to a Loki server. Each service is uniquely identified by the combination of two labels, customer_id and service_name. I would like a Grafana panel containing a table that lists all services that did not send any logs in the last 24 hours.

I query based on the two labels customer_id and service_name. All possible values are stored in dashboard variables with the same names. I tried using the absent_over_time function,

absent_over_time({customer_id=~"$customer_id", service_name=~"$service_name"}[24h])

but the problem is that if even one combination of service_name and customer_id returns a stream, the function returns no data, so it does not tell me which individual services were silent. Any help would be appreciated.


Solution

  • I did not find a solution to the problem described in my question, but I found a workaround:
    The servers my services run on were also using Prometheus to send information about the services, so I had the Prometheus up metric available.

    What I wanted on my dashboard was a panel listing services that did not send any logs in the last 24 hours but did send logs in the last, say, 7 days. Since whether the up metric reported any value worked as an indicator of whether the server was sending other things, such as logs, properly, I made two queries: one retrieving the last up value over 24 hours and one retrieving the last up value over 7 days.

    Then I used transformations to merge the results of the two queries and group them by agent_hostname (a server identifier that is unique per customer) and customer_id. Afterwards I filtered the rows to display only those where the 24-hour query did not have any data.
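
The merge, group, and filter steps above amount to a set difference: keep every server seen in the 7-day result that is missing from the 24-hour result. The Python sketch below only illustrates that logic outside Grafana; it is not the actual transformation configuration. The function name `silent_servers` and the sample data are hypothetical, and each query result is assumed to have been reduced to (customer_id, agent_hostname) pairs.

```python
from typing import Iterable, List, Set, Tuple

# One server is identified by (customer_id, agent_hostname),
# mirroring the "group by" step described in the answer.
ServerKey = Tuple[str, str]


def silent_servers(
    seen_7d: Iterable[ServerKey], seen_24h: Iterable[ServerKey]
) -> List[ServerKey]:
    """Return servers present in the 7-day result but absent from the 24-hour result."""
    recent: Set[ServerKey] = set(seen_24h)
    # Deduplicate and sort so the output is stable for display in a table.
    return sorted({key for key in seen_7d if key not in recent})


# Hypothetical sample data standing in for the two query results.
seen_7d = [("acme", "host-a"), ("acme", "host-b"), ("globex", "host-c")]
seen_24h = [("acme", "host-a"), ("globex", "host-c")]

print(silent_servers(seen_7d, seen_24h))  # [('acme', 'host-b')]
```

Servers that never reported within the 7-day window do not appear in either input, so they are excluded from the output, which matches the "did send logs in the last 7 days" requirement.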