Canary rollouts with Linkerd and Argo Rollouts


I'm trying to configure a canary rollout for a demo, but I'm having trouble getting the traffic splitting to work with Linkerd. The funny part is that I was able to get this working with Istio, and I find Istio much more complicated than Linkerd.

I have a basic Go service defined like this:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: fish
spec:
  [...]
  strategy:
    canary:
      canaryService: canary-svc
      stableService: stable-svc
      trafficRouting:
        smi: {}
      steps:
      - setWeight: 5
      - pause: {}
      - setWeight: 20
      - pause: {}
      - setWeight: 50
      - pause: {}
      - setWeight: 80
      - pause: {}
---
apiVersion: v1
kind: Service
metadata:
  name: canary-svc
spec:
  selector:
    app: fish
  ports:
    - name: http
      port: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: stable-svc
spec:
  selector:
    app: fish
  ports:
    - name: http
      port: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fish
  annotations:
    kubernetes.io/ingress.class: 'nginx'
    cert-manager.io/cluster-issuer: letsencrypt-production
    cert-manager.io/acme-challenge-type: dns01
    external-dns.alpha.kubernetes.io/hostname: fish.local
    nginx.ingress.kubernetes.io/enable-cors: "true"
    nginx.ingress.kubernetes.io/cors-allow-methods: "PUT, GET, POST, OPTIONS"
    nginx.ingress.kubernetes.io/cors-allow-origin: "*"
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header l5d-dst-override $service_name.$namespace.svc.cluster.local:$service_port;
spec:
  rules:
    - host: fish.local
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: stable-svc
              port:
                number: 8080

When I deploy (sync) via ArgoCD, I can see that the traffic split is 50/50:

- apiVersion: split.smi-spec.io/v1alpha2
  kind: TrafficSplit
  metadata:
    [...]
    name: fish
    namespace: default
  spec:
    backends:
    - service: canary-svc
      weight: "50"
    - service: stable-svc
      weight: "50"
    service: stable-svc

However, when I run curl in a while loop, I only get responses from stable-svc. The only time I see a change is after I have moved the service completely to 100%.

I tried to follow this: https://argoproj.github.io/argo-rollouts/getting-started/smi/

Any help would be greatly appreciated.

Thanks


Solution

  • There's a bit more context in this issue, but the TL;DR is that ingress controllers tend to send traffic to individual pod IPs instead of the Service address, so requests never pass through the Service that the TrafficSplit applies to. Putting Linkerd's proxy in ingress mode tells it to override that behaviour. NGINX also already has a setting that makes it hit Services instead of endpoints directly; you can see that in their docs here.
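
    The answer does not name the two settings. As a rough sketch, they are most likely the ingress-nginx `service-upstream` annotation and Linkerd's `linkerd.io/inject: ingress` annotation. The fragments below only show where each one would go; the Ingress name `fish` comes from the question, and applying the second fragment to the ingress controller's own Deployment is an assumption about how the controller was installed. Normally only one of the two is needed.

    ```yaml
    # Option 1 (assumed to be the NGINX setting the answer refers to):
    # fragment of the existing "fish" Ingress. With service-upstream enabled,
    # ingress-nginx proxies to the Service's cluster IP rather than to pod
    # endpoints, so Linkerd can apply the TrafficSplit.
    metadata:
      name: fish
      annotations:
        nginx.ingress.kubernetes.io/service-upstream: "true"
    ---
    # Option 2 (assumed to be what "ingress mode" means here):
    # fragment of the ingress controller's Deployment pod template, not of the
    # fish Rollout. The proxy is injected in ingress mode, where it routes on
    # the l5d-dst-override header already set in the question's Ingress.
    spec:
      template:
        metadata:
          annotations:
            linkerd.io/inject: ingress
    ```

    After either change, the ingress controller pods need to be restarted (or the Ingress re-synced) before the curl loop can be expected to show responses from both canary-svc and stable-svc.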