Tags: istio, envoyproxy, istio-sidecar

Istio Sidecar to retry on specified status codes (503)


By default, if we don't define any VirtualService, Istio will generate something like the following Envoy route/retry configuration:

{
 "cluster": "outbound|9100||quote-svc-cip.quote.svc.cluster.local",
 "timeout": "0s",
 "retry_policy": {
  "retry_on": "connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes",
  "num_retries": 2,
  "retry_host_predicate": [
   {
    "name": "envoy.retry_host_predicates.previous_hosts"
   }
  ],
  "host_selection_retry_max_attempts": "5",
  "retriable_status_codes": [
   503
  ]
 },
 "max_grpc_timeout": "0s"
}

But if we specify our own VirtualService, e.g.:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: book-svc-cip
  namespace: book
spec:
  hosts:
  - book-svc-cip.book.svc.cluster.local
  http:
  - retries:
      attempts: 3
      retryOn: connect-failure,refused-stream,unavailable,retriable-status-codes
    route:
    - destination:
        host: book-svc-cip.book.svc.cluster.local

The generated config will look like:

{
 "cluster": "outbound|9281||book-svc-cip.book.svc.cluster.local",
 "timeout": "0s",
 "retry_policy": {
  "retry_on": "connect-failure,refused-stream,unavailable,retriable-status-codes",
  "num_retries": 3,
  "retry_host_predicate": [
   {
    "name": "envoy.retry_host_predicates.previous_hosts"
   }
  ],
  "host_selection_retry_max_attempts": "5"
 },
 "max_grpc_timeout": "0s"
}

Notice that retriable_status_codes is missing, even though retry_on still includes retriable-status-codes.

The default appears to be defined in https://github.com/istio/istio/blob/1.9.0/pilot/pkg/networking/core/v1alpha3/route/retry/retry.go#L38-L39, but there is no option or field to configure retriable_status_codes through a VirtualService.

How can we set retriable_status_codes in Istio?

Update #1: My Istio version is 1.6.9, but a solution that only works on a newer version would also be appreciated.


Solution

  • The example in the documentation, which lists the status code directly in retryOn, should work according to the source code; I can confirm it works on a later version.

    https://istio.io/latest/docs/reference/config/networking/virtual-service/#HTTPRetry

        retries:
          attempts: 3
          perTryTimeout: 2s
          retryOn: connect-failure,refused-stream,503
    

    The source code for 1.6.9 shows that the above example should work.
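
Applied to the VirtualService from the question, the documented example might look like the sketch below. It assumes, as the answer states, that Istio accepts a numeric HTTP status code such as 503 directly in retryOn. The perTryTimeout of 2s is carried over from the documentation example and is not a value the question asked for.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: book-svc-cip
  namespace: book
spec:
  hosts:
  - book-svc-cip.book.svc.cluster.local
  http:
  - retries:
      attempts: 3
      # Assumption: copied from the documentation example; adjust or drop as needed.
      perTryTimeout: 2s
      # 503 is listed as a plain status code instead of the
      # retriable-status-codes keyword used in the question.
      retryOn: connect-failure,refused-stream,unavailable,503
    route:
    - destination:
        host: book-svc-cip.book.svc.cluster.local
```

After applying it, the generated route can be inspected with istioctl proxy-config routes <pod> -o json (the pod name is a placeholder) to check whether retriable_status_codes now contains 503.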