python-3.x amazon-web-services amazon-s3 amazon-sagemaker imread

Customer Error: imread read blank (None) image for file- Sagemaker AWS

I am following this tutorial with my custom data and my custom S3 buckets where train and validation data are. I am getting the following error:

Customer Error: imread read blank (None) image for file: /opt/ml/input/data/train/s3://image-classification/image_classification_model_data/train/img-001.png

All my training data is in one folder named 'train'. I have set up my lst file like this, as suggested by the doc:

22  1   s3://image-classification/image_classification_model_data/train/img-001.png
86  0   s3://image-classification/image_classification_model_data/train/img-002.png
...

My other configurations:

import sagemaker

s3_bucket = 'image-classification'
prefix = 'image_classification_model_data'


s3train = 's3://{}/{}/train/'.format(s3_bucket, prefix)
s3validation = 's3://{}/{}/validation/'.format(s3_bucket, prefix)

s3train_lst = 's3://{}/{}/train_lst/'.format(s3_bucket, prefix)
s3validation_lst = 's3://{}/{}/validation_lst/'.format(s3_bucket, prefix)



train_data = sagemaker.inputs.TrainingInput(s3train, distribution='FullyReplicated', 
                        content_type='application/x-image', s3_data_type='S3Prefix')

validation_data = sagemaker.inputs.TrainingInput(s3validation, distribution='FullyReplicated', 
                             content_type='application/x-image', s3_data_type='S3Prefix')

train_data_lst = sagemaker.inputs.TrainingInput(s3train_lst, distribution='FullyReplicated', 
                        content_type='application/x-image', s3_data_type='S3Prefix')

validation_data_lst = sagemaker.inputs.TrainingInput(s3validation_lst, distribution='FullyReplicated', 
                             content_type='application/x-image', s3_data_type='S3Prefix')


data_channels = {'train': train_data, 'validation': validation_data, 'train_lst': train_data_lst, 
                 'validation_lst': validation_data_lst}

I downloaded the images and checked them manually, and they open fine. Not sure why this error reports the image as blank. Any suggestion would be great.

Solution

SageMaker copies the input data you specify in s3train onto the instance under /opt/ml/input/data/train/, and that's why you get this error: as you can see from the error message, it is trying to concatenate the path from the lst file with the directory where it expects the image to be. So put only the filenames in your lst file (remove the S3 path) and it should be fine.
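If the lst file is already generated, something like the following could strip the S3 path from each entry. This is only a sketch: the helper name strip_s3_prefix is made up for illustration, and it assumes each lst line has three whitespace-separated fields (index, label, image path) and that the output should be tab-separated. Lines that do not have three fields are passed through unchanged.

```python
def strip_s3_prefix(lst_text, s3_prefix):
    """Return lst_text with s3_prefix removed from the start of each image path.

    Hypothetical helper, not part of the SageMaker SDK.
    """
    # Normalise the prefix so it always ends with exactly one slash.
    prefix = s3_prefix.rstrip("/") + "/"
    out_lines = []
    for line in lst_text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        fields = line.split(None, 2)  # index, label, path
        if len(fields) != 3:
            out_lines.append(line)  # leave unexpected lines untouched
            continue
        index, label, path = fields
        path = path.strip()
        if path.startswith(prefix):
            # Keep only the part relative to the 'train' channel directory.
            path = path[len(prefix):]
        out_lines.append("\t".join([index, label, path]))
    return "\n".join(out_lines) + "\n"


# Example with the two entries from the question.
original = (
    "22\t1\ts3://image-classification/image_classification_model_data/train/img-001.png\n"
    "86\t0\ts3://image-classification/image_classification_model_data/train/img-002.png\n"
)
fixed = strip_s3_prefix(
    original, "s3://image-classification/image_classification_model_data/train/"
)
print(fixed)
```

With relative entries such as img-001.png, the path in the lst file joined with /opt/ml/input/data/train/ should point at the file that was copied to the instance. Upload the rewritten lst file to the train_lst location again (and do the same for the validation lst) before restarting the training job.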