I have a SageMaker TrainingStep followed by a ProcessingStep. I need to pass the output path of the TrainingStep to the ProcessingStep.
The TrainingStep writes its output to an S3 folder, in a file named s3://mybucket/output.tar.gz. However, the output path $['ModelArtifacts']['S3ModelArtifacts'] returns the wrong file name: s3://mybucket/model.tar.gz
To work around this, I need to remove the last 12 characters of the path (the file name model.tar.gz). So the Amazon States Language expression I want to evaluate is:
$['ModelArtifacts']['S3ModelArtifacts'][0,-12]
I am using Python to configure my state machine, so I have the following code:
ProcessingInput(
    source=train_step.output()["$['ModelArtifacts']['S3ModelArtifacts'][0,-12]"],
    destination='/opt/ml/processing/inference_output/',
    input_name='inference_output'
),
This generates the following Amazon States Language fragment:
"S3Input": {
    "S3Uri.$": "$['$['ModelArtifacts']['S3ModelArtifacts'][0,-12]']",
    "LocalPath": "/opt/ml/processing/inference_output/",
    "S3DataType": "S3Prefix",
    "S3InputMode": "File",
    "S3DataDistributionType": "FullyReplicated",
    "S3CompressionType": "None"
}
That is not what I want. The desired result is:
"S3Uri.$": "$['ModelArtifacts']['S3ModelArtifacts'][0,-12]"
How can I modify my Python code to produce the desired result?
The workaround is to pass the correct S3 path to the state machine as an execution input parameter.
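As a rough sketch of that workaround, the snippet below computes the folder path in plain Python (outside the state machine) and wraps it in a dictionary that could be supplied as the execution input. The key name TrainingOutputPath and both helper functions are hypothetical, chosen only for illustration. How the value is then wired into ProcessingInput depends on your setup; with the Step Functions Data Science SDK this would presumably go through an ExecutionInput placeholder, which is not shown here.

```python
import json

# Value reported at $['ModelArtifacts']['S3ModelArtifacts'] in the question.
REPORTED_MODEL_URI = "s3://mybucket/model.tar.gz"


def output_prefix(model_uri: str) -> str:
    """Drop the trailing file name, keeping the S3 prefix and its final slash."""
    return model_uri[: model_uri.rfind("/") + 1]


def build_execution_input(model_uri: str) -> dict:
    # "TrainingOutputPath" is a hypothetical key name (an assumption);
    # use whatever key your state machine reads from its execution input.
    return {"TrainingOutputPath": output_prefix(model_uri)}


if __name__ == "__main__":
    # Print the JSON document that would be passed when starting an execution.
    print(json.dumps(build_execution_input(REPORTED_MODEL_URI), indent=2))
```

For this particular URI, cutting at the last slash gives the same result as removing the last 12 characters, since model.tar.gz is 12 characters long. Cutting at the slash has the advantage of not depending on the file name's length.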