How to install Loki+Promtail to forward K8S pod logs to Grafana Cloud


I am still new to K8S, but I am trying to migrate a VM-based infrastructure to K8S on GCP/GKE. Prometheus metrics are already being forwarded correctly, but I am stuck on forwarding the logs. I am also trying to do this without Helm, to better understand K8S.

The logs of the loki pod look as expected when compared to the Docker setup on a VM. But I do not know how to start the promtail Service without a port, since in the Docker setup promtail does not have to expose a port. I get the following error:

The Service "promtail" is invalid: spec.ports: Required value

My configuration files look like this. loki-config.yml:

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

ingester:
  wal:
    enabled: true
    dir: /tmp/wal
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h       # Any chunk not receiving new logs in this time will be flushed
  max_chunk_age: 1h           # All chunks will be flushed when they hit this age, default is 1h
  chunk_target_size: 1048576  # Loki will attempt to build chunks up to 1MB (1048576 bytes), flushing earlier if chunk_idle_period or max_chunk_age is reached first
  chunk_retain_period: 30s    # Must be greater than index read cache TTL if using an index cache (Default index read cache TTL is 5m)
  max_transfer_retries: 0     # Chunk transfers disabled

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /tmp/loki/boltdb-shipper-active
    cache_location: /tmp/loki/boltdb-shipper-cache
    cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: filesystem
  filesystem:
    directory: /tmp/loki/chunks

compactor:
  working_directory: /tmp/loki/boltdb-shipper-compactor
  shared_store: filesystem

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  ingestion_burst_size_mb: 16
  ingestion_rate_mb: 16

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

ruler:
  storage:
    type: local
    local:
      directory: /tmp/loki/rules
  rule_path: /tmp/loki/rules-temp
  alertmanager_url: http://localhost:9093
  ring:
    kvstore:
      store: inmemory
  enable_api: true

promtail-config.yml:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

# this is the place where promtail will store the progress about how far it has read the logs
positions:
  filename: /tmp/positions.yaml

# address of loki server to which promtail should push the logs
clients:
  - url: https://999999:<API_KEY>@<GRAFANA_CLOUD_LOKI_HOST>/api/prom/push
# which logs to read/scrape
scrape_configs:
  - job_name: system
    static_configs:
    - targets:
        - localhost
      labels:
        job: varlogs
        __path__: /var/log/*log
  - job_name: node
    static_configs:
    - targets:
        - localhost
      labels:
        job: node  # label-1
        host: localhost    # label-2
        __path__: /var/lib/docker/containers/*/*log

Then the deployment files. loki-deploy.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: loki
spec:
  selector:
    matchLabels:
      app: loki
      network: cluster-1
  replicas: 1
  template:
    metadata:
      labels:
        app: loki
        network: cluster-1
    spec:
      containers:
        - name: loki
          image:  grafana/loki
          ports:
            - containerPort: 3100
          volumeMounts:
            - name: loki-config-volume
              mountPath: /etc/loki/loki.yml
              subPath: loki.yml
      volumes:
        - name: loki-config-volume
          configMap:
            name: "loki-config"
---
apiVersion: v1
kind: Service
metadata:
  name: loki
  namespace: monitoring
spec:
  selector:
    app: loki
  type: NodePort
  ports:
  - name: loki
    protocol: TCP
    port: 3100

And finally promtail-deploy.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: promtail
spec:
  selector:
    matchLabels:
      app: promtail
      network: cluster-1
  replicas: 1
  template:
    metadata:
      labels:
        app: promtail
        network: cluster-1
    spec:
      containers:
        - name: promtail
          image:  grafana/promtail
          volumeMounts:
            - name: promtail-config-volume
              mountPath: /mnt/config/promtail-config.yml
              subPath: promtail.yml
      volumes:
        - name: promtail-config-volume
          configMap:
            name: "promtail-config"
---
apiVersion: v1
kind: Service
metadata:
  name: promtail
  namespace: monitoring

Solution

  • The error message tells you exactly what is wrong.

    Your second Kubernetes Service manifest, named promtail, has no spec at all. A Service requires at least spec.ports. You should also add a label selector, so that the Service can find the Deployment's pods.

    apiVersion: v1
    kind: Service
    metadata:
      name: promtail
      namespace: monitoring
    spec:
      selector:
        app: promtail
      ports:
        - port: <ServicePort>
          targetPort: <PodPort>
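
    As a concrete sketch, assuming the promtail-config.yml from the question is the one in use: Promtail's HTTP server listens on 9080 (http_listen_port), so the placeholders could be filled in as follows. The port name http-metrics is a hypothetical label chosen for illustration. The selector only matches if the promtail pods also run in the monitoring namespace, because a Service selects pods in its own namespace only.

    apiVersion: v1
    kind: Service
    metadata:
      name: promtail
      namespace: monitoring
    spec:
      selector:
        app: promtail          # must match the pod template labels of the Deployment
      ports:
        - name: http-metrics   # hypothetical name, any valid port name works
          protocol: TCP
          port: 9080           # port exposed by the Service
          targetPort: 9080     # http_listen_port from promtail-config.yml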
    

    However, Promtail pushes logs to the URL configured under clients, so it needs no inbound port for that. If no other service needs to reach the Promtail pods, simply skip creating the Service.

    In addition, if you need to expose these logs to a service running outside of your cluster, such as Grafana Cloud, you should create a Service of type LoadBalancer for Loki instead. This requests a public IP for it, making it reachable from the internet, assuming your Kubernetes cluster is managed by a cloud provider that can provision load balancers.

    Exposing Loki publicly is insecure, especially with auth_enabled: false, but it is a reasonable first step towards consuming these logs externally.
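
    A minimal sketch of such a Service, assuming the loki Deployment from the question runs in the monitoring namespace and that this manifest replaces the existing NodePort Service of the same name:

    apiVersion: v1
    kind: Service
    metadata:
      name: loki
      namespace: monitoring
    spec:
      type: LoadBalancer       # asks the cloud provider for an external IP
      selector:
        app: loki
      ports:
        - name: loki
          protocol: TCP
          port: 3100           # port exposed on the external IP
          targetPort: 3100     # http_listen_port from loki-config.yml

    Once the provider has assigned an address, it is listed in the EXTERNAL-IP column of kubectl get service loki -n monitoring.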