How to reference sys.stdout created outside of a DAG to be used inside of a DAG with withParam?


I am working with an Argo workflow.

There is a DAG step in my entrypoint which follows several normal steps. One of these steps writes to sys.stdout. Once inside the DAG step, I want some of the tasks to reference that output.

I know that when the workflow just goes from one step to the next (without the DAG), we can reference the sys.stdout with {{steps.step-name.outputs.result}}. The same does not work inside a DAG task though.

How can I reference the sys.stdout inside of a DAG task so I can use it with withParam?

Edit:

The workflow looks like the following:

  templates:
  - name: the-entrypoint
    steps:
    - - name: step01
        template: first-step
    - - name: step02
        template: second-step
    - - name: step03
        template: third-step
    - - name: step04-the-dag-step
        template: fourth-step

In general, if third-step writes to sys.stdout, we can reference the output with {{steps.step03.outputs.result}} in fourth-step. However, in this case fourth-step is a DAG, and if one of the DAG tasks uses {{steps.step03.outputs.result}} as an argument/parameter, the workflow throws an error.

The question is then how can one correctly reference the sys.stdout generated by third-step inside fourth-step DAG tasks?


Solution

    A bit of background about template outputs

    Argo Workflows supports a number of different types of templates.

    Each type of template supports different variable references within the template.

    Within a steps template, you may access the output parameters of other steps with steps.step-name.outputs.parameters.param-name (for named parameters) or steps.step-name.outputs.result (for the stdout of a script or container template).

    Example (see full Workflow):

      - name: output-parameter
        steps:
        - - name: generate-parameter
            template: whalesay
        - - name: consume-parameter
            template: print-message
            arguments:
              parameters:
              - name: message
                value: "{{steps.generate-parameter.outputs.parameters.hello-param}}"
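
    The same pattern works for stdout via outputs.result. A minimal sketch, assuming a hypothetical script template named gen-list (the template names, the image, and the printed list are illustrative and not part of the linked Workflow), and assuming print-message is a template that takes a message input:

      - name: result-example              # hypothetical steps template
        steps:
        - - name: generate
            template: gen-list
        - - name: consume
            template: print-message       # assumed to accept a "message" input
            arguments:
              parameters:
              - name: message
                # stdout of the gen-list script
                value: "{{steps.generate.outputs.result}}"
      - name: gen-list                    # hypothetical script template
        script:
          image: python:3.12-alpine       # assumed image
          command: [python]
          source: |
            import json, sys
            # whatever is written to stdout becomes outputs.result
            json.dump(["a", "b", "c"], sys.stdout)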
    

    Within a dag template, you may access the output of other tasks using a similar syntax, with the tasks. prefix instead of steps.

    Example (see full Workflow):

        - name: main
          dag:
            tasks:
              - name: flip-coin
                template: flip-coin
              - name: heads
                depends: flip-coin
                template: heads
                when: "{{tasks.flip-coin.outputs.result}} == heads"
              - name: tails
                depends: flip-coin
                template: tails
                when: "{{tasks.flip-coin.outputs.result}} == tails"
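
    A task output can also drive withParam inside the same dag template. A rough sketch, assuming hypothetical templates gen-list (which prints a JSON list to stdout) and process-item (which takes an item input); neither comes from the linked Workflow:

      - name: fan-out-dag                 # hypothetical dag template
        dag:
          tasks:
          - name: generate
            template: gen-list            # hypothetical, prints e.g. ["a", "b", "c"]
          - name: process
            depends: generate
            template: process-item        # hypothetical, takes an "item" input
            arguments:
              parameters:
              - name: item
                value: "{{item}}"
            # one task instance per element of the JSON list
            withParam: "{{tasks.generate.outputs.result}}"

    Note that withParam expects a JSON list, so the generating task's stdout has to be valid JSON.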
    

    Within a container or script template, you may access only the inputs of that template*. From a container or script template, you may not directly access the outputs of steps or tasks that belong to a steps or dag template.
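
    As an illustration of that scoping rule, a leaf template might look like the sketch below. The template name, image, and parameter name are placeholders; the point is only that the template reads its own inputs, and whoever invokes it decides which step or task output to pass in.

      - name: process-item                # hypothetical container template
        inputs:
          parameters:
          - name: item
        container:
          image: alpine:3.19              # assumed image
          command: [echo]
          # only inputs.* of this template is in scope here,
          # not steps.* or tasks.* of the caller
          args: ["{{inputs.parameters.item}}"]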

    Referencing a step output from a DAG

    As mentioned above, a DAG template cannot directly reference step outputs from a steps template. But a step within a steps template can pass a step output to a DAG template.

    In your example, it would look something like this:

      templates:
      - name: the-entrypoint
        steps:
        - - name: step01
            template: first-step
        - - name: step02
            template: second-step
        - - name: step03
            template: third-step
        - - name: step04-the-dag-step
            template: fourth-step
            arguments:
              parameters:
              - name: some-param
                value: "{{steps.step03.outputs.result}}"
      - name: fourth-step
        inputs:
          parameters:
          - name: some-param
        dag:
          tasks:
            # use the input parameter in the fourth-step template with "{{inputs.parameters.some-param}}"
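
    To make the fourth-step placeholder concrete, here is one way it could be filled in with withParam. This is a sketch under assumptions: the task name fan-out and the process-item template are hypothetical, and third-step is assumed to print a JSON list to stdout.

      - name: fourth-step
        inputs:
          parameters:
          - name: some-param
        dag:
          tasks:
          - name: fan-out                 # hypothetical task name
            template: process-item        # hypothetical template, defined below
            arguments:
              parameters:
              - name: item
                value: "{{item}}"
            # some-param carries the stdout of step03
            withParam: "{{inputs.parameters.some-param}}"
      - name: process-item
        inputs:
          parameters:
          - name: item
        container:
          image: alpine:3.19              # assumed image
          command: [echo]
          args: ["{{inputs.parameters.item}}"]

    If third-step prints something that is not a JSON list, withParam has nothing to expand, so the stdout has to be valid JSON for this to work.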
    

    tl;dr

    steps. and tasks. variables are meant to be referenced within a single steps or dag template, but they can be explicitly passed between templates. If you need to use the output of a step in a DAG, pass that output as an argument where the DAG is invoked.

    In your case, the DAG template is invoked as the last of four steps, so that is where you will pass the argument.

    * Okay, you also have access to various other variables from within a script or container template, but you don't have access to variables that are scoped as internal variables within another template.