Tags: kubernetes, traefik, jaeger, traefik-ingress

Expose Jaeger Collector to clients outside of cluster


I am using the Jaeger Operator to deploy the Jaeger Query and Collector services to Kubernetes (K3S, actually), along with an Elasticsearch instance as the storage backend.

The Jaeger Operator creates an Ingress for the Jaeger Query service, but it assumes that all of your Jaeger Agents also run inside the Kubernetes cluster. That is not the case for me: some of the applications I am tracing run outside the cluster, so my Jaeger Collector needs to be accessible from outside.

This Jaeger GitHub issue discusses a potential enhancement to the Jaeger Operator for this functionality. It suggests creating your own Ingress outside of the Operator to expose the Jaeger Collector, but it doesn't go into detail.

I also want to use gRPC for communication between the Agent outside the cluster and the Collector inside it. This article describes how to set up an Ingress for gRPC (though it is not specific to Jaeger). I used the example Ingress spec there, made some tweaks for my scenario, and deployed it to my cluster:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
  name: simple-prod-collector
  namespace: monitoring
spec:
  rules:
  - host: jaeger-collector.my-container-dev
    http:
      paths:
      - backend:
          serviceName: simple-prod-collector
          servicePort: 14250

This creates an Ingress for me alongside the simple-prod-query ingress that is created by the Jaeger Operator:

NAMESPACE     NAME                    CLASS    HOSTS                               ADDRESS          PORTS   AGE
monitoring    simple-prod-query       <none>   jaeger-query.my-container-dev       10.128.107.220   80      6h56m
monitoring    simple-prod-collector   <none>   jaeger-collector.my-container-dev                    80      4h33m

Here are the services behind the ingress:

NAMESPACE        NAME                             TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                                  AGE
monitoring       simple-prod-collector            ClusterIP      10.43.20.131    <none>           9411/TCP,14250/TCP,14267/TCP,14268/TCP   7h5m
monitoring       simple-prod-query                ClusterIP      10.43.141.211   <none>           16686/TCP                                7h5m
monitoring       simple-prod-collector-headless   ClusterIP      None            <none>           9411/TCP,14250/TCP,14267/TCP,14268/TCP   7h5m

Unfortunately, my Jaeger Agent still can't seem to talk to it. I am deploying the Jaeger Agent via docker-compose and, as you can see here, I am configuring it to connect to jaeger-collector.my-container-dev:80:

version: "3"

services:

  jaeger-agent:
    image: jaegertracing/jaeger-agent
    hostname: jaeger-agent
    command: ["--reporter.grpc.host-port=jaeger-collector.my-container-dev:80"]
    ports:
      - "6831:6831/udp"  # UDP | accept jaeger.thrift in compact Thrift protocol used by most current Jaeger clients
      - "5778:5778"      # HTTP | serve configs, sampling strategies
      - "14271:14271"    # HTTP | admin port: health check at / and metrics at /metrics
    restart: on-failure

I can see that something is wrong with the connection because when I hit the Jaeger Agent's Sampling Strategy service with an HTTP GET to http://localhost:5778/sampling?service=myservice, I get an error back that says the following:

collector error: rpc error: code = Unimplemented desc = Not Found: HTTP status code 404; transport: received the unexpected content-type "text/plain; charset=utf-8"

Is there something wrong with my Ingress spec? No trace data seems to be making it from my Agent to the Collector, and I get errors when hitting the Jaeger Agent's sampling service. Also, I find it a little strange that no IP address is listed for simple-prod-collector in the kubectl get ing output, but perhaps that is a red herring.

As mentioned above, I am using K3S, which seems to use Traefik as its ingress controller (as opposed to NGINX). I checked the logs for the Traefik controller and didn't see anything helpful there either.


Solution

  • OK, I figured out the issue, which may be obvious to those with more expertise. The guide I linked to above, which describes how to write an Ingress spec for gRPC, is specific to NGINX. Meanwhile, I am using K3S, which came out of the box with Traefik as the Ingress Controller. Therefore, the annotations I used in my Ingress spec had no effect:

    metadata:
      annotations:
        kubernetes.io/ingress.class: "nginx"
        nginx.ingress.kubernetes.io/ssl-redirect: "true"
        nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    

    So I found another Stack Overflow post discussing Traefik and gRPC and modified my original Ingress spec above a bit to include the annotations mentioned there:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: simple-prod-collector
      namespace: monitoring
      annotations:
        kubernetes.io/ingress.class: traefik
        ingress.kubernetes.io/protocol: h2c
        traefik.protocol: h2c
    spec:
      rules:
      - host: jaeger-collector.my-container-dev
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: simple-prod-collector
                port:
                  number: 14250
    

    These are the changes I made:

    1. Changed the metadata/annotations (I am sure this was the actual change needed).
    2. Updated the apiVersion from networking.k8s.io/v1beta1 to networking.k8s.io/v1. That required some structural changes (an explicit path and pathType, and the nested backend.service block in place of serviceName/servicePort), but none of the actual content changed AFAIK.
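The structural difference between the two API versions can be shown as a small transformation. This is only a sketch: the helper name convert_path_to_v1 is hypothetical, and defaulting to path "/" with pathType "Prefix" is an assumption that mirrors the spec above rather than anything Kubernetes does automatically.

```python
from typing import Any, Dict


def convert_path_to_v1(path: Dict[str, Any], default_path_type: str = "Prefix") -> Dict[str, Any]:
    """Convert one networking.k8s.io/v1beta1 HTTP path entry to the v1 shape.

    Hypothetical helper for illustration only; it is not part of any Kubernetes tooling.
    """
    backend = path["backend"]
    port = backend["servicePort"]
    # In v1 a numeric port goes under `number` and a named port under `name`.
    port_ref = {"number": port} if isinstance(port, int) else {"name": port}
    return {
        # Assumption: fall back to "/" when the old entry had no path at all.
        "path": path.get("path", "/"),
        # `pathType` is a required field in v1.
        "pathType": path.get("pathType", default_path_type),
        "backend": {"service": {"name": backend["serviceName"], "port": port_ref}},
    }


# The single path entry from the original v1beta1 spec in the question.
old_path = {"backend": {"serviceName": "simple-prod-collector", "servicePort": 14250}}
new_path = convert_path_to_v1(old_path)
print(new_path)
```

The service name and port number pass through unchanged, which matches the observation that only the structure differs between the two versions.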

    Hopefully this helps someone else running into this same confusion.
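The root cause here (annotations written for one ingress controller while another is installed) can be caught with a simple check before deploying. The following is a minimal sketch, not an official tool: the FOREIGN_PREFIXES table and the find_mismatches function are hypothetical names, and the prefix list is an illustrative assumption rather than a complete inventory of controller annotations.

```python
from typing import Dict, List

CLASS_ANNOTATION = "kubernetes.io/ingress.class"

# Assumption for illustration: annotation prefixes that belong to a *different*
# controller than the one installed. This list is not exhaustive.
FOREIGN_PREFIXES: Dict[str, List[str]] = {
    "traefik": ["nginx.ingress.kubernetes.io/"],
    "nginx": ["traefik."],
}


def find_mismatches(annotations: Dict[str, str], installed_controller: str) -> List[str]:
    """Return human-readable warnings for annotations aimed at another controller."""
    problems: List[str] = []
    declared = annotations.get(CLASS_ANNOTATION)
    if declared is not None and declared != installed_controller:
        problems.append(
            f"{CLASS_ANNOTATION} is {declared!r}, but the installed controller is {installed_controller!r}"
        )
    for key in annotations:
        if key == CLASS_ANNOTATION:
            continue
        for prefix in FOREIGN_PREFIXES.get(installed_controller, []):
            if key.startswith(prefix):
                problems.append(f"{key} looks like it targets a different controller")
    return problems


# Annotations from the original spec in the question.
original = {
    "kubernetes.io/ingress.class": "nginx",
    "nginx.ingress.kubernetes.io/ssl-redirect": "true",
    "nginx.ingress.kubernetes.io/backend-protocol": "GRPC",
}
# Annotations from the working spec in this answer.
fixed = {
    "kubernetes.io/ingress.class": "traefik",
    "ingress.kubernetes.io/protocol": "h2c",
    "traefik.protocol": "h2c",
}

for warning in find_mismatches(original, "traefik"):
    print("WARNING:", warning)
print("fixed spec warnings:", find_mismatches(fixed, "traefik"))
```

Run against the original annotations with "traefik" as the installed controller, the check flags the ingress class and both NGINX-specific annotations; the revised annotations produce no warnings.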