I am following this tutorial with my own data and my own S3 bucket, which holds the training and validation data. I am getting the following error:
Customer Error: imread read blank (None) image for file: /opt/ml/input/data/train/s3://image-classification/image_classification_model_data/train/img-001.png
All my training data is in one folder named 'train'. I have set up my lst file like this, as suggested by the docs:
22 1 s3://image-classification/image_classification_model_data/train/img-001.png
86 0 s3://image-classification/image_classification_model_data/train/img-002.png
...
My other configurations:
s3_bucket = 'image-classification'
prefix = 'image_classification_model_data'
s3train = 's3://{}/{}/train/'.format(s3_bucket, prefix)
s3validation = 's3://{}/{}/validation/'.format(s3_bucket, prefix)
s3train_lst = 's3://{}/{}/train_lst/'.format(s3_bucket, prefix)
s3validation_lst = 's3://{}/{}/validation_lst/'.format(s3_bucket, prefix)
train_data = sagemaker.inputs.TrainingInput(s3train, distribution='FullyReplicated',
                                            content_type='application/x-image', s3_data_type='S3Prefix')
validation_data = sagemaker.inputs.TrainingInput(s3validation, distribution='FullyReplicated',
                                                 content_type='application/x-image', s3_data_type='S3Prefix')
train_data_lst = sagemaker.inputs.TrainingInput(s3train_lst, distribution='FullyReplicated',
                                                content_type='application/x-image', s3_data_type='S3Prefix')
validation_data_lst = sagemaker.inputs.TrainingInput(s3validation_lst, distribution='FullyReplicated',
                                                     content_type='application/x-image', s3_data_type='S3Prefix')
data_channels = {'train': train_data, 'validation': validation_data,
                 'train_lst': train_data_lst, 'validation_lst': validation_data_lst}
I downloaded the images and checked them manually, and they open fine. I am not sure why the error reports the image as blank. Any suggestions would be great.
SageMaker copies the input data you specify in s3train onto the instance under /opt/ml/input/data/train/, and that is why you get the error. As the error message shows, it concatenates the path from the lst file onto the directory where it expects the images to be, so the full S3 URI ends up appended to the local path and no file exists there. Put only the filenames in your lst file, relative to the train prefix (remove the s3 path), and it should be fine.
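As a rough sketch of the fix, the snippet below first mimics the path concatenation suggested by the error message, then rewrites the lst entries so the image column is relative to the train prefix. The helper name `to_relative_lst_line` and the constant `LOCAL_TRAIN_DIR` are made up for this illustration, and the join is only an imitation of what the error message implies, not SageMaker's actual code. It assumes single-label entries with three columns (index, label, path) and writes them back tab-separated.

```python
import posixpath

# Local directory where the 'train' channel is copied, per the error message.
LOCAL_TRAIN_DIR = "/opt/ml/input/data/train/"

def to_relative_lst_line(line, s3_prefix):
    """Rewrite one lst line so its image path is relative to the channel root.

    Hypothetical helper: assumes three whitespace-separated columns
    (index, label, image path).
    """
    index, label, path = line.split(None, 2)
    path = path.strip()
    if path.startswith(s3_prefix):
        path = path[len(s3_prefix):]
    # Emit tab-separated columns.
    return "\t".join([index, label, path])

s3train = "s3://image-classification/image_classification_model_data/train/"
lines = [
    "22 1 s3://image-classification/image_classification_model_data/train/img-001.png",
    "86 0 s3://image-classification/image_classification_model_data/train/img-002.png",
]

# Joining the local directory with the full S3 URI reproduces the bad path.
bad_path = posixpath.join(LOCAL_TRAIN_DIR, lines[0].split(None, 2)[2])

fixed = [to_relative_lst_line(l, s3train) for l in lines]
good_path = posixpath.join(LOCAL_TRAIN_DIR, fixed[0].split("\t")[2])

print(bad_path)
print(good_path)
print("\n".join(fixed))
```

After rewriting, upload the new lst file to the train_lst prefix in place of the old one; the same change applies to the validation lst file.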