
Argo Workflows is trying to save the artifact to /var/run/argo/outputs/artifacts. Where is this specified?


I found argo lint today. Thank you to the Argo team! It is a very useful tool and has saved me a lot of time. The following YAML lints with no errors, but when I try to run it, I get the error below. How can I track down what is happening?

FATA[2022-03-14T19:36:29.512Z] workflows.argoproj.io "hello-world-s5rm5" not found

Here is the workflow

---
{
   "apiVersion": "argoproj.io/v1alpha1",
   "kind": "Workflow",
   "metadata": {
      "annotations": {
         "workflows.argoproj.io/description": "testing a linter",
         "workflows.argoproj.io/version": ">= 3.1.0"
      },
      "labels": {
         "workflows.argoproj.io/archive-strategy": "false"
      },
      "generateName": "hello-world-",
      "namespace": "sandbox"
   },
   "spec": {
      "arguments": {
         "parameters": [
            {
               "name": "msg",
               "value": "Hello there"
            }
         ]
      },
      "entrypoint": "entrypoint",
      "securityContext": {
         "fsGroup": 2000,
         "fsGroupChangePolicy": "OnRootMismatch",
         "runAsGroup": 3000,
         "runAsNonRoot": true,
         "runAsUser": 1000
      },
      "templates": [
         {
            "container": {
               "args": [
                  "cowsay Hello Test >> {{outputs.artifacts.message}}"
               ],
               "command": [
                  "sh",
                  "-c"
               ],
               "image": "docker/whalesay:latest",
               "imagePullPolicy": "IfNotPresent"
            },
            "name": "whalesay",
            "outputs": {
               "artifacts": [
                  {
                     "name": "message",
                     "path": "/tmp/output.tgz",
                     "s3": {
                        "key": "whalesay"
                     }
                  }
               ]
            },
            "retryStrategy": {
               "limit": "10"
            },
            "securityContext": {
               "fsGroup": 2000,
               "fsGroupChangePolicy": "OnRootMismatch",
               "runAsGroup": 3000,
               "runAsNonRoot": true,
               "runAsUser": 1000
            }
         },
         {
            "inputs": {
               "artifacts": [
                  {
                     "s3": {
                        "key": "whalesay"
                     },
                     "name": "data",
                     "path": "/tmp/input"
                  }
               ]
            },
            "name": "print",
            "retryStrategy": {
               "limit": "10"
            },
            "script": {
               "command": [
                  "python"
               ],
               "image": "python:alpine3.6",
               "imagePullPolicy": "IfNotPresent",
               "source": "import sys \nsys.stdout.write(\"{{inputs.artifacts.data}}\")\n\n"
            },
            "securityContext": {
               "fsGroup": 2000,
               "fsGroupChangePolicy": "OnRootMismatch",
               "runAsGroup": 3000,
               "runAsNonRoot": true,
               "runAsUser": 1000
            }
         },
         {
            "dag": {
               "tasks": [
                  {
                     "name": "whalesay",
                     "template": "whalesay"
                  },
                  {
                     "arguments": {
                        "artifacts": [
                           {
                              "from": "{{whalesay.outputs.artifacts.message}}",
                              "name": "data"
                           }
                        ]
                     },
                     "dependencies": [
                        "whalesay"
                     ],
                     "name": "print",
                     "template": "print"
                  }
               ]
            },
            "name": "entrypoint"
         }
      ]
   }
}
...

Here is the result of kubectl describe

Name:         hello-world
Namespace:    sandbox
Labels:       workflows.argoproj.io/archive-strategy=false
Annotations:  workflows.argoproj.io/description: testing a linter
              workflows.argoproj.io/version: >= 3.1.0
API Version:  argoproj.io/v1alpha1
Kind:         Workflow
Metadata:
  Creation Timestamp:  2022-03-14T19:33:19Z
  Generation:          1
  Managed Fields:
    API Version:  argoproj.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:workflows.argoproj.io/description:
          f:workflows.argoproj.io/version:
        f:labels:
          .:
          f:workflows.argoproj.io/archive-strategy:
      f:spec:
      f:status:
    Manager:         argo
    Operation:       Update
    Time:            2022-03-14T19:33:19Z
  Resource Version:  16499078
  UID:               b438cf44-241c-44bf-bb42-e470eaf4ca08
Spec:
  Arguments:
    Parameters:
      Name:    msg
      Value:   Hello there
  Entrypoint:  entrypoint
  Security Context:
    Fs Group:                2000
    Fs Group Change Policy:  OnRootMismatch
    Run As Group:            3000
    Run As Non Root:         true
    Run As User:             1000
  Templates:
    Container:
      Args:
        cowsay Hello Test >> {{outputs.artifacts.message}}
      Command:
        sh
        -c
      Image:              docker/whalesay:latest
      Image Pull Policy:  IfNotPresent
      Name:               
      Resources:
    Inputs:
    Metadata:
    Name:  whalesay
    Outputs:
      Artifacts:
        Name:  message
        Path:  /tmp/output.tgz
        s3:
          Key:  whalesay
    Retry Strategy:
      Limit:  10
    Security Context:
      Fs Group:                2000
      Fs Group Change Policy:  OnRootMismatch
      Run As Group:            3000
      Run As Non Root:         true
      Run As User:             1000
    Inputs:
      Artifacts:
        Name:  data
        Path:  /tmp/input
        s3:
          Key:  whalesay
    Metadata:
    Name:  print
    Outputs:
    Retry Strategy:
      Limit:  10
    Script:
      Command:
        python
      Image:              python:alpine3.6
      Image Pull Policy:  IfNotPresent
      Name:               
      Resources:
      Source:  import sys 
sys.stdout.write("{{inputs.artifacts.data}}")


    Security Context:
      Fs Group:                2000
      Fs Group Change Policy:  OnRootMismatch
      Run As Group:            3000
      Run As Non Root:         true
      Run As User:             1000
    Dag:
      Tasks:
        Arguments:
        Name:      whalesay
        Template:  whalesay
        Arguments:
          Artifacts:
            From:  {{whalesay.outputs.artifacts.message}}
            Name:  data
        Dependencies:
          whalesay
        Name:      print
        Template:  print
    Inputs:
    Metadata:
    Name:  entrypoint
    Outputs:
Status:
  Finished At:  <nil>
  Started At:   <nil>
Events:         <none>

UPDATE:

I have re-installed (upgraded) Argo and made some progress. The error (below) suggests that I have set up my artifact repository incorrectly. I am following the instructions linked below to the best of my understanding.

The Google technical support folks are telling me that my GCS bucket is configured for read only. I am conversing with them on how to open the bucket for writing. Once that is done, am I correct that updating the configmap is sufficient?

https://argoproj.github.io/argo-workflows/configure-artifact-repository/#google-cloud-storage-gcs

https://argoproj.github.io/argo-workflows/artifact-repository-ref/

Another Update:

Thanks to the Google help folks, I think I have cloud storage configured, but I cannot yet confirm it. I am now getting the following error (full pod description below).

Question: Where is the prefix "/var/run/argo/outputs/artifacts" specified? I have not encountered this before.

What is the proper way to reconcile this automatic insertion in the workflow?

 open /var/run/argo/outputs/artifacts/tmp/output.tgz.tgz: no such file or directory
                      hello-worldnztr8-4118214805 (v1:metadata.name)
      ARGO_CONTAINER_RUNTIME_EXECUTOR:    emissary
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-worldnztr8
      ARGO_WORKFLOW_UID:                  4ed3e706-48d9-4d22-bf73-fcccc4a4e6d0
      ARGO_CONTAINER_NAME:                init
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{"artifacts":[{"name":"message","path":"/tmp/output.tgz","s3":{"key":"whalesay"}}]},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["sh","-c"],"args":["cowsay Hello Test \u003e\u003e {{outputs.artifacts.message}}"],"resources":{},"imagePullPolicy":"IfNotPresent"},"archiveLocation":{"archiveLogs":false},"retryStrategy":{"limit":"10"},"securityContext":{"runAsUser":1000,"runAsGroup":3000,"runAsNonRoot":true,"fsGroup":2000,"fsGroupChangePolicy":"OnRootMismatch"}}
      ARGO_NODE_ID:                       hello-worldnztr8-4118214805
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tvh6k (ro)
Containers:
  wait:
    Container ID:  containerd://6593d624b0350cc51a739e19d78f39e5726a9f1dfddc7e8995b082a073f57864
    Image:         quay.io/argoproj/argoexec:v3.3.0
    Image ID:      quay.io/argoproj/argoexec@sha256:b37739320a21d1d96789082c659b96f2dcb59c51483d9852bc858f6cfddf82fb
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      wait
      --loglevel
      info
    State:          Terminated
      Reason:       Error
      Message:      open /var/run/argo/outputs/artifacts/tmp/output.tgz.tgz: no such file or directory
      Exit Code:    1
      Started:      Wed, 16 Mar 2022 20:42:36 +0000
      Finished:     Wed, 16 Mar 2022 20:42:37 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      hello-worldnztr8-4118214805 (v1:metadata.name)
      ARGO_CONTAINER_RUNTIME_EXECUTOR:    emissary
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-worldnztr8
      ARGO_WORKFLOW_UID:                  4ed3e706-48d9-4d22-bf73-fcccc4a4e6d0
      ARGO_CONTAINER_NAME:                wait
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{"artifacts":[{"name":"message","path":"/tmp/output.tgz","s3":{"key":"whalesay"}}]},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["sh","-c"],"args":["cowsay Hello Test \u003e\u003e {{outputs.artifacts.message}}"],"resources":{},"imagePullPolicy":"IfNotPresent"},"archiveLocation":{"archiveLogs":false},"retryStrategy":{"limit":"10"},"securityContext":{"runAsUser":1000,"runAsGroup":3000,"runAsNonRoot":true,"fsGroup":2000,"fsGroupChangePolicy":"OnRootMismatch"}}
      ARGO_NODE_ID:                       hello-worldnztr8-4118214805
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tvh6k (ro)
  main:
    Container ID:  containerd://c1c3c014b6c975b5702da564f76a4f5026352bc1f3b57f7dc4d1738104ee7ab8
    Image:         docker/whalesay:latest
    Image ID:      sha256:c717279bbba020bf95ac72cf47b2c8abb3a383ad4b6996c1a7a9f2a7aaa480ad
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/run/argo/argoexec
      emissary
      --
      sh
      -c
    Args:
      cowsay Hello Test >> {{outputs.artifacts.message}}
    State:          Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Wed, 16 Mar 2022 20:42:36 +0000
      Finished:     Wed, 16 Mar 2022 20:42:36 +0000
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_CONTAINER_NAME:                main
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{"artifacts":[{"name":"message","path":"/tmp/output.tgz","s3":{"key":"whalesay"}}]},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["sh","-c"],"args":["cowsay Hello Test \u003e\u003e {{outputs.artifacts.message}}"],"resources":{},"imagePullPolicy":"IfNotPresent"},"archiveLocation":{"archiveLogs":false},"retryStrategy":{"limit":"10"},"securityContext":{"runAsUser":1000,"runAsGroup":3000,"runAsNonRoot":true,"fsGroup":2000,"fsGroupChangePolicy":"OnRootMismatch"}}
      ARGO_NODE_ID:                       hello-worldnztr8-4118214805
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tvh6k (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  var-run-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-tvh6k:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  55s   default-scheduler  Successfully assigned default/hello-worldnztr8-4118214805 to gke-cluster-1-default-pool-d262cd84-va7g
  Normal  Pulled     55s   kubelet            Container image "quay.io/argoproj/argoexec:v3.3.0" already present on machine
  Normal  Created    54s   kubelet            Created container init
  Normal  Started    54s   kubelet            Started container init
  Normal  Pulled     53s   kubelet            Container image "quay.io/argoproj/argoexec:v3.3.0" already present on machine
  Normal  Created    53s   kubelet            Created container wait
  Normal  Started    53s   kubelet            Started container wait
  Normal  Pulled     53s   kubelet            Container image "docker/whalesay:latest" already present on machine
  Normal  Created    53s   kubelet            Created container main
  Normal  Started    53s   kubelet            Started container main

Solution

  • The complete fix is detailed here: https://github.com/argoproj/argo-workflows/issues/8168#event-6261265751

    For the purposes of this discussion, the command must write to the explicit output location (not a placeholder), e.g. /tmp/output

    I think the convention is that you do not put the .tgz suffix in the output location, but that is not yet confirmed, as there was another fix involved. Perhaps someone from the Argo team can confirm this.
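
    A minimal sketch of the whalesay template with that change applied, assuming the rest of the workflow stays as posted in the question. The path in the error message looks like the declared artifact path with /var/run/argo/outputs/artifacts prepended and .tgz appended, which would explain why a declared path that already ends in .tgz shows up as output.tgz.tgz. The location /tmp/output is only the example path from above, not a required name, and the securityContext and retryStrategy blocks are omitted here for brevity.

```yaml
# Sketch only: the whalesay template from the question with two changes.
{
   "name": "whalesay",
   "container": {
      "image": "docker/whalesay:latest",
      "imagePullPolicy": "IfNotPresent",
      "command": ["sh", "-c"],
      # Change 1: write to the explicit file path instead of the
      # {{outputs.artifacts.message}} placeholder.
      "args": ["cowsay Hello Test >> /tmp/output"]
   },
   "outputs": {
      "artifacts": [
         {
            "name": "message",
            # Change 2: same explicit path as in args, with no .tgz suffix
            # (assumed convention, not yet confirmed; see the note above).
            "path": "/tmp/output",
            "s3": {
               "key": "whalesay"
            }
         }
      ]
   }
}
```

    The point of the sketch is that the command and the artifact declaration name the same file, so the wait container has something to pick up after main exits.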