kubernetes, prometheus, grafana, seldon, seldon-core

Seldon: How to Use My Own Grafana and Prometheus Instances?


I want to use my already existing Prometheus and Grafana instances in the monitoring namespace to emulate what seldon-core-analytics is doing. I'm using the prometheus community helm charts and installed kube-prometheus-stack on k8s. Here's what I've done so far:

In the values.yaml file, under the prometheus config, I added the following annotations:

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/path: "/prometheus"

Next, I looked at the prometheus-config.yaml in their GitHub repo and copied the configuration into a ConfigMap.

I also created a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: seldon-servicemonitor-default
  labels:
    seldon-monitor: seldon-default
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/managed-by: seldon-core
  endpoints:
    - interval: 15s
      path: /metrics
      port: http
    - interval: 15s
      path: /prometheus
      port: http
  namespaceSelector:
    matchNames:
      - seldon
      - default
      - monitoring

No errors with the above steps so far, but the Prometheus instance doesn't appear to be scraping metrics from a model I deployed in a different namespace. What other configuration do I need so that my own Prometheus and Grafana instances can gather and visualize the metrics from my Seldon-deployed models? The documentation doesn't really explain how to do this with your own instances, and the setup seldon-core-analytics provides isn't production-ready.


Solution

  • The Prometheus configuration in seldon-core-analytics is quite standard: it is based on built-in Kubernetes service discovery and uses annotations to find scraping targets:

    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/path: "/metrics"
      prometheus.io/scheme: "http"
      prometheus.io/port: "9100"
    

    In their example configuration, Prometheus targets pods, services, and endpoints that carry the prometheus.io/scrape: "true" annotation (note that annotation values must be quoted strings). The other three annotations override the default scraping parameters per target. So with a config like the example, you only need to put some of these annotations on your pods.
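    As a sketch, an annotation-driven scrape job in plain Prometheus configuration looks roughly like this (standard kubernetes_sd_configs relabeling; close to, but not copied from, what seldon-core-analytics ships):

    ```yaml
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # keep only pods annotated prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"
          # the other three annotations override scheme, path, and port
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
    ```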

    The way kube-prometheus-stack works is different. It uses the Prometheus Operator and CRDs to shape the configuration. This design document describes the purpose of each CRD.

    You need to create a ServiceMonitor resource to define a scraping rule for new services. The ServiceMonitor itself must carry the labels listed under the serviceMonitorSelector key of the Prometheus resource (another CRD). It is hard to provide a working example in these circumstances, but this short guide should be enough to understand what to do.
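    The labels to match can be read off the Prometheus custom resource itself, e.g. with kubectl -n monitoring get prometheus -o yaml. With kube-prometheus-stack the selector typically matches on the Helm release name; the value below is an assumption, so check your own instance:

    ```yaml
    # Excerpt from the Prometheus CR created by the chart; the exact
    # label value depends on your Helm release name.
    spec:
      serviceMonitorSelector:
        matchLabels:
          release: kube-prometheus-stack
    ```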

    I suggest you describe one of the ServiceMonitors you already have, then create a new one, changing the labels under matchLabels. Do not change the namespace in the new object; by default the Prometheus Operator does not look for ServiceMonitors in other namespaces. To make the ServiceMonitor discover targets in all namespaces, set any: true under namespaceSelector:

    spec:
      namespaceSelector:
        any: true
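
    Putting it together, the ServiceMonitor from the question would need the label your Prometheus selects on plus the all-namespaces selector (the release value below is an assumption; use whatever your Prometheus CR actually expects):

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: seldon-servicemonitor-default
      namespace: monitoring            # keep it where the operator looks
      labels:
        release: kube-prometheus-stack # must match serviceMonitorSelector
    spec:
      selector:
        matchLabels:
          app.kubernetes.io/managed-by: seldon-core
      endpoints:
        - interval: 15s
          path: /metrics
          port: http
      namespaceSelector:
        any: true                      # discover matching Services in any namespace
    ```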