The following code snippet is inspired by this.
import sagemaker

hyperparameters = {
    "max_depth": "5",
    "eta": "0.2",
    "gamma": "4",
    "min_child_weight": "6",
    "subsample": "0.7",
    "objective": "reg:squarederror",
    "num_round": "10",
}

output_path = 's3://{}/{}/output'.format(s3_bucket_name, s3_prefix)

estimator = sagemaker.estimator.Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", region_name, "1.2-2"),
    hyperparameters=hyperparameters,
    role=role,
    instance_count=1,
    instance_type='ml.m5.2xlarge',
    volume_size=1,  # 1 GB
    output_path=output_path,
)

estimator.fit({'train': s3_input_train, 'validation': s3_input_val})
It works fine. I was trying to use:
training_image_name = image_uris.retrieve(framework='xgboost', region=region_name, version='latest')
instead of:
sagemaker.image_uris.retrieve("xgboost", region_name, "1.2-2")
to (I believe) get hold of the latest training image, but with that image reg:squarederror is not supported. Is my code for getting the latest image name incorrect?
Using "latest" is not recommended, as per the documentation (see the note there): https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html
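Use a specific version instead, as pinned versions are more stable. As far as I know, "latest" resolves to the old XGBoost 0.72 image, which predates the reg:squarederror objective; that would explain the error you are seeing.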
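If you want to guard against accidentally passing "latest", something like the following might help. It is only a rough sketch: require_pinned_version and retrieve_xgboost_image are hypothetical helper names (not part of the SageMaker SDK), and the default of "1.2-2" is simply the version from the question. The only SDK call used is image_uris.retrieve, with the same arguments as in the question.

```python
# Tags this sketch refuses to accept; the set is an assumption for illustration.
DISCOURAGED_TAGS = {"latest"}


def require_pinned_version(version: str) -> str:
    """Return the version unchanged, or raise if it is a discouraged tag."""
    if version.strip().lower() in DISCOURAGED_TAGS:
        raise ValueError(
            "Use a specific XGBoost version (e.g. '1.2-2') instead of %r" % version
        )
    return version


def retrieve_xgboost_image(region_name: str, version: str = "1.2-2") -> str:
    """Hypothetical wrapper: look up the XGBoost image URI for a pinned version."""
    # Imported lazily so the version check above works without the SDK installed.
    from sagemaker import image_uris

    return image_uris.retrieve(
        framework="xgboost",
        region=region_name,
        version=require_pinned_version(version),
    )
```

You would then pass image_uri=retrieve_xgboost_image(region_name) to the Estimator, and a call with version="latest" fails early with a ValueError instead of surfacing later as an unsupported objective.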