node.js, amazon-web-services, aws-step-functions

AWS State Machine Choice Wildcards


I am iterating through all the objects in an S3 bucket. I need to process every file that has the .json extension, at any depth under the prefix. For example:

  1. /x/
  2. /x/1.json
  3. /x/2.json
  4. /x/y/
  5. /x/y/1.json
  6. /x/y/2.json
  7. /x/y/z/

I have been trying to use a wildcard in my state machine choice so that it only goes to the next task for processing if the object is a JSON file. If it isn't, I want to move to the next iteration until I get a JSON file. Below is the "choice" in my state machine. When the step function runs, it greys out on TraversalChoice, so I can only imagine I am doing something wrong when I define the wildcard. I would really appreciate it if someone could point me in the right direction. Thanks!

    "Traversal": {
      "Type": "Task",
      "Resource": "arn for lambda that get objects",
      "Parameters": {
        "NextContinuationToken.$": "$.traversal.NextContinuationToken"
      },
      "ResultPath": "$.traversal",
      "Next": "TraversalChoice"
    },
    "TraversalChoice": {
      "Type": "Choice",
      "Choices": [{
        "Not": {
          "Variable": "$.traversal.Files.Key",
          "StringMatches": "x/*.json"
        },
        "Next": "Traversal"
      }],
      "Default": "lambdaToProcess"
    },
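
For reference, here is a rough local emulation of how a `StringMatches` pattern such as `x/*.json` could be checked against the keys listed above. It is only a sketch: the helper `stringMatches` is hypothetical, it assumes `*` matches zero or more characters (including `/`) and that everything else is literal, it ignores escaped wildcards, and it assumes the keys carry no leading slash.

```javascript
// Hypothetical helper: approximates the StringMatches comparison locally.
// Assumption: "*" means "zero or more of any character"; escapes are not handled.
function stringMatches(value, pattern) {
    const source = pattern
        .split("*")
        // Escape regex metacharacters so the literal parts match literally.
        .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
        .join("[\\s\\S]*");
    return new RegExp("^" + source + "$").test(value);
}

// Keys from the example listing, written without the leading slash (assumption).
const keys = ["x/", "x/1.json", "x/2.json", "x/y/", "x/y/1.json", "x/y/2.json", "x/y/z/"];

const jsonKeys = keys.filter(key => stringMatches(key, "x/*.json"));
console.log(jsonKeys);
// → [ 'x/1.json', 'x/2.json', 'x/y/1.json', 'x/y/2.json' ]
```

Under those assumptions, the pattern already selects the JSON files at every depth and skips the folder placeholders.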

For context here is the code that retrieves all the objects.

    let params = {
        Bucket: bucket,
        MaxKeys: 1,
        ContinuationToken: event.NextContinuationToken || null
    };

    if (prefix) params.Prefix = prefix;

    try {
        let response = await s3.listObjectsV2(params).promise();

        return {
            Files: response.Contents,
            NextContinuationToken: response.NextContinuationToken || ""
        };
    } catch (err) {
        // Error handling was not shown in the original snippet; rethrowing is a placeholder.
        throw err;
    }

Solution

  • I figured it out. It was a dumb error on my part. The problem was that Files is an array (it is response.Contents, which holds up to MaxKeys entries), so the path has to index into it before reading Key.

    "Variable": "$.traversal.Files.Key"
    

    Should have been...

    "Variable": "$.traversal.Files[0].Key"
    

    In the end, because I only intend to process one file at a time, I altered my code to return the key as traversal.File.

        "TraversalChoice": {
          "Type": "Choice",
          "Choices": [{
            "Not": {
              "Variable": "$.traversal.File",
              "StringMatches": "*.json"
            },
            "Next": "Traversal"
          }],
          "Default": "lambdaToProcess"
        },
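
    The altered Lambda code is not shown above. A minimal sketch of what it might look like follows, assuming the same AWS SDK v2 `listObjectsV2(...).promise()` call as in the question. The function name `listNextFile` and its argument list are illustrative and not from the original code.

```javascript
// Sketch only: `listNextFile` and its parameters are hypothetical names.
// `s3` is assumed to be an AWS SDK v2 S3 client, as used in the question.
async function listNextFile(s3, bucket, prefix, event) {
    const params = { Bucket: bucket, MaxKeys: 1 };
    if (prefix) params.Prefix = prefix;
    // Only send a token when the previous page returned one.
    if (event && event.NextContinuationToken) {
        params.ContinuationToken = event.NextContinuationToken;
    }

    const response = await s3.listObjectsV2(params).promise();
    // Contents is an array even with MaxKeys: 1, so take its first entry.
    const first = (response.Contents || [])[0];

    return {
        // A plain string, so the Choice state can test "$.traversal.File" directly.
        File: first ? first.Key : "",
        NextContinuationToken: response.NextContinuationToken || ""
    };
}
```

    With this shape, `$.traversal.File` resolves to a string and `StringMatches` has something to compare. Note that an empty `File` does not match `*.json`, so the choice above would route back to Traversal; how the loop should end once the token is empty is not covered here.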