I am trying to download a dataset hosted in AWS.
I am using s3cmd, which I configured with my access key and secret key.
I can list the files in the bucket without problems using:
s3cmd ls s3://yahoo-webscope/I3set13/
I then used get to download the dataset:
s3cmd get --recursive s3://yahoo-webscope/I3set13/
But the following error is shown:
ERROR: S3 error: 403 (Forbidden)
A few solutions I found suggest changing the bucket policy, but I can't do that because I am not the bucket owner.
Please let me know the reason behind the problem and how I can solve it.
According to https://multimediacommons.wordpress.com/yfcc100m-core-dataset/, although the dataset is hosted in an S3 bucket, access to it is restricted, so you need to submit a request and follow further email instructions for access:
Getting the YFCC100M: The dataset can be requested at Yahoo Webscope. You will need to create a Yahoo account if you do not have one already, and once logged in you will find it straightforward to submit the request for the YFCC100M. Webscope will ask you to tell them what your plans are with the dataset, which helps them justify the existence of their academic outreach program and allows them to keep offering datasets in the future. Unlike other datasets available at Webscope, the YFCC100M does not require you to be a student or faculty at an accredited university, so you will be automatically approved.
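This also explains why ls works while get fails: in S3, listing a bucket and reading its objects are governed by separate permissions, so a bucket can allow one and deny the other. As a rough sketch, a small wrapper like the one below can turn the s3cmd error into a hint before you retry the recursive download. The function name classify_s3cmd_error and the hint labels are hypothetical and not part of s3cmd; the function only pattern-matches the error text shown in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of s3cmd): map an s3cmd error message
# to a short hint by looking for the HTTP status code in the text.
classify_s3cmd_error() {
    case "$1" in
        *"403"*) echo "access-denied" ;;  # credentials lack read permission
        *"404"*) echo "not-found" ;;      # bucket or key does not exist
        *)       echo "other" ;;          # anything else: inspect manually
    esac
}

# Usage sketch (needs network access and a configured s3cmd; assumes
# s3cmd exits with a non-zero status when the download fails):
#   out=$(s3cmd get --recursive s3://yahoo-webscope/I3set13/ 2>&1) \
#       || classify_s3cmd_error "$out"
```

If the hint is still access-denied after your request has been approved, the credentials in your s3cmd configuration are probably not the ones the access instructions refer to.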