rate-limiting, retry-logic, argo-workflows

Retrying after a settable delay in Argo Workflows


One of our Argo Workflows steps may hit a rate limit, and I want to be able to tell Argo how long it should wait before the next retry.

Is there a way to do it?

I've seen Retries in the documentation, but it only covers retry counts and backoff strategies, and it doesn't look like the delay can be parameterized.


Solution

  • As far as I know, there's no built-in way to add a pause before the next retry.

    However, you could build your own with Argo's exit handler feature.

    apiVersion: argoproj.io/v1alpha1
    kind: Workflow
    metadata:
      generateName: exit-handler-with-pause-
    spec:
      arguments:
        parameters:
        - name: pause-before-retry-seconds
          value: "60"
      entrypoint: intentional-fail
      onExit: exit-handler
      templates:
      - name: intentional-fail
        container:
          image: alpine:latest
          command: [sh, -c]
          args: ["echo intentional failure; exit 1"]
      - name: exit-handler
        steps:
        - - name: pause
            template: pause
            when: "{{workflow.status}} != Succeeded"
      - name: pause
        container:
          image: alpine:latest
          env:
          - name: SECONDS
            value: "{{workflow.parameters.pause-before-retry-seconds}}"
          command: [sh, -c]
          args:
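          # Literal block (|) keeps the two commands on separate lines;
          # a folded block (>-) would join them into a single echo.
          - |
            echo "Pausing before retry..."
            sleep "$SECONDS"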
    

    If the retry pause needs to be calculated within the workflow, check out the exit handler with params example.
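
    As a rough sketch of that idea (not the upstream example itself), a step can print the number of seconds to wait and pass it to a step-level exit hook. This assumes an Argo Workflows version that supports `hooks.exit` on a step. The template names `call-api` and `pause`, the parameter name `seconds`, and the value 30 are hypothetical. The step in this sketch succeeds; whether the output of a failed step reaches the hook is worth verifying on your version.

    ```yaml
    # Sketch only: names, images, and the 30-second value are illustrative assumptions.
    apiVersion: argoproj.io/v1alpha1
    kind: Workflow
    metadata:
      generateName: computed-pause-
    spec:
      entrypoint: main
      templates:
      - name: main
        steps:
        - - name: call-api
            template: call-api
            hooks:
              exit:
                # Exit hook for this step; it receives the step's stdout as a parameter.
                template: pause
                arguments:
                  parameters:
                  - name: seconds
                    value: "{{steps.call-api.outputs.result}}"
      - name: call-api
        script:
          image: alpine:latest
          command: [sh]
          # Whatever the script prints to stdout becomes outputs.result.
          source: |
            echo 30
      - name: pause
        inputs:
          parameters:
          - name: seconds
        container:
          image: alpine:latest
          command: [sh, -c]
          args: ["echo Pausing for {{inputs.parameters.seconds}}s; sleep {{inputs.parameters.seconds}}"]
    ```

    In a real workflow, the `call-api` script would derive the delay from the rate-limit response (for example, a retry-after value) instead of printing a constant.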