I'm running a SageMaker pipeline with two steps: tuning, then training. The goal is to get the best hyperparameters from the tuning step and then use those hyperparameters in the subsequent training step.
I am aware that I can use HyperparameterTuningJobAnalytics to retrieve the tuned hyperparameters after the tuning job finishes. However, I want to treat them as a step dependency and pass them directly to the next TrainingStep's estimator, like this:
hyperparameters=step_tuning.properties.BestTrainingJob.TunedHyperParameters,
But this doesn't work; it fails with: AttributeError: 'PropertiesMap' object has no attribute 'update'. Here is the rest of the pipeline code:
tf_estimator_final = TensorFlow(
    entry_point='./train.py',
    role=role,
    sagemaker_session=sagemaker_session,
    code_location=code_location,
    instance_count=1,
    instance_type="ml.p3.16xlarge",
    framework_version='2.4',
    py_version="py37",
    base_job_name=base_job_name,
    output_path=model_path,  # if output_path not specified,
    hyperparameters=step_tuning.properties.BestTrainingJob.TunedHyperParameters,
    model_dir="/opt/ml/model",
    script_mode=True,
)
step_train = TrainingStep(
    name=base_job_name,
    estimator=tf_estimator_final,
    inputs={
        "train": TrainingInput(
            s3_data=train_s3
        )
    },
    depends_on=[step_tuning],
)
pipeline = Pipeline(
    name=jobname,
    steps=[
        step_tuning,
        step_train,
    ],
    sagemaker_session=sagemaker_session,
)
json.loads(pipeline.definition())
Any suggestions?
This can't be done in SageMaker Pipelines at the moment. The estimator expects hyperparameters to be a plain dict that it can call .update() on, while step_tuning.properties.BestTrainingJob.TunedHyperParameters is a PropertiesMap placeholder that is only resolved when the pipeline executes, which is why the AttributeError is raised at definition time.
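As a fallback, the HyperparameterTuningJobAnalytics route mentioned in the question works outside the pipeline once the tuning job has completed. Below is a minimal sketch, not the pipeline solution asked for; the job name, the objective direction (maximize), and the boto3 alternative are illustrative assumptions:

import boto3
from sagemaker.analytics import HyperparameterTuningJobAnalytics

tuning_job_name = "my-tuning-job"  # assumed name, replace with your tuning job's name

# Option 1: pull all trial results as a DataFrame and pick the best row
# (ascending=False assumes the tuning objective is maximized)
analytics = HyperparameterTuningJobAnalytics(tuning_job_name)
df = analytics.dataframe()
best_row = df.sort_values("FinalObjectiveValue", ascending=False).iloc[0]

# Option 2: ask the service for the best training job directly
sm = boto3.client("sagemaker")
desc = sm.describe_hyper_parameter_tuning_job(
    HyperParameterTuningJobName=tuning_job_name
)
best_hyperparameters = desc["BestTrainingJob"]["TunedHyperParameters"]

# best_hyperparameters is a plain dict of strings and can be passed as the
# hyperparameters argument of a new estimator for a follow-up training job.

Either way, the retrieval happens after the tuning job finishes, so it can't be wired up as a property reference inside the same pipeline definition.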