AWS Step Functions: returning a substring of a placeholder in Python


I have a SageMaker TrainingStep followed by a ProcessingStep. I need to pass the output path of the TrainingStep to the ProcessingStep.

The TrainingStep writes its output to an S3 folder as a file named s3://mybucket/output.tar.gz. However, the output path $['ModelArtifacts']['S3ModelArtifacts'] returns the wrong file name: s3://mybucket/model.tar.gz

To work around this bug, I need to remove the last 12 characters of the path (the file name model.tar.gz), so the Amazon States Language expression I want to evaluate is:

$['ModelArtifacts']['S3ModelArtifacts'][0,-12]

I am using Python to configure my state machine, so I have the following code:

ProcessingInput(
    source=train_step.output()["$['ModelArtifacts']['S3ModelArtifacts'][0,-12]"],
    destination='/opt/ml/processing/inference_output/',
    input_name='inference_output'
),

This generates the following Amazon States Language fragment:

        "S3Input": {
          "S3Uri.$": "$['$['ModelArtifacts']['S3ModelArtifacts'][0,-12]']",
          "LocalPath": "/opt/ml/processing/inference_output/",
          "S3DataType": "S3Prefix",
          "S3InputMode": "File",
          "S3DataDistributionType": "FullyReplicated",
          "S3CompressionType": "None"
        }

This is not what I want. The desired output is:

"S3Uri.$": "$['ModelArtifacts']['S3ModelArtifacts'][0,-12]"

How can I modify my Python code to produce the desired result?


Solution

  • The workaround is to pass the correct S3 path as an execution input parameter, rather than deriving it from the TrainingStep output inside the state machine.
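
A minimal sketch of that workaround, using only the standard library so it runs without the SDK installed. It computes the corrected path in ordinary Python before the execution starts, then references it from the state definition through the context object ($$.Execution.Input). The key name ModelOutputPath and the helper parent_prefix are hypothetical, chosen for illustration. In the Step Functions Data Science SDK, the equivalent would presumably be an ExecutionInput placeholder from stepfunctions.inputs passed as the ProcessingInput source; that has not been verified against the asker's setup.

```python
import json

# Hypothetical key name for the execution input (not from the original question).
INPUT_KEY = "ModelOutputPath"


def parent_prefix(s3_uri: str) -> str:
    """Drop the final path component of an S3 URI, keeping the trailing slash."""
    return s3_uri.rsplit("/", 1)[0] + "/"


# Path reported by the training step, as given in the question.
reported = "s3://mybucket/model.tar.gz"

# The trimming happens in plain Python, before the execution starts.
# This dict is what would be supplied as the state machine's execution input.
execution_input = {INPUT_KEY: parent_prefix(reported)}

# The processing step then reads the value from the execution input
# via the context object, with no string manipulation in the state definition.
s3_input = {
    "S3Uri.$": f"$$.Execution.Input['{INPUT_KEY}']",
    "LocalPath": "/opt/ml/processing/inference_output/",
    "S3DataType": "S3Prefix",
    "S3InputMode": "File",
}

print(json.dumps(execution_input))
print(json.dumps(s3_input, indent=2))
```

Because the training output location is configured by the caller anyway, it is known before the execution starts, so nothing needs to be trimmed at run time.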