Tags: amazon-web-services, amazon-s3, gcloud, gsutil

Access Denied when copying Amazon S3 files using gsutil - fine using aws s3 cp


Similar to this issue here, I'm getting Access Denied when trying to copy files from an Amazon S3 bucket to a Google Cloud Storage bucket.

I've also broken it down into two parts: copying from S3 to my local machine also fails. When I run gsutil -D cp s3://my-bucket/filename.txt ./test-copy I get the following:

    gslib.cloud_api.AccessDeniedException: AccessDeniedException: 403 AccessDenied
    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>**REDACTED**</RequestId><HostId>**REDACTED**bc=</HostId></Error>

What I've tried:

  • Using a different AWS Access key (with full AWS permissions to the S3 bucket for overkill!)
  • When I run aws s3 cp s3://my-bucket/filename.txt ./test-copy it works fine, so definitely no permissions issues on AWS
  • Checked the gcloud config I'm using - it's the correct one - and checked permissions on the Google Cloud Storage bucket (added Storage Admin for overkill); a couple of sanity-check commands are below this list
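
A couple of quick sanity checks to confirm which credentials each tool is actually picking up (assuming both the aws CLI and gsutil are installed and on the PATH):

    # Show which AWS identity the aws CLI is authenticating as
    aws sts get-caller-identity

    # Show which credentials/profile the aws CLI resolved and where they came from
    aws configure list

    # Show gsutil's environment details, including the path(s) to the .boto config file it reads
    gsutil version -l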

The fact that I can copy using aws s3 cp confirms the AWS access key I'm using has permission to copy from that S3 bucket. It feels to me like an issue with Python/gsutil; the versions on my laptop are:

  • gsutil version: 5.20
  • Python 3.10.8

Here's more of the DEBUG log snippet if it helps:

    Zero length chunk ignored
    reply: 'HTTP/1.1 403 Forbidden\r\n'
    header: x-amz-request-id: **REDACTED**
    header: x-amz-id-2: **REDACTED**
    header: Content-Type: application/xml
    header: Transfer-Encoding: chunked
    header: Date: Tue, 07 Mar 2023 15:35:40 GMT
    header: Server: AmazonS3
    DEBUG 0307 15:35:41.384847 connection.py] Response headers: [('x-amz-request-id', '**REDACTED**'), ('x-amz-id-2', '**REDACTED**'), ('Content-Type', 'application/xml'), ('Transfer-Encoding', 'chunked'), ('Date', 'Tue, 07 Mar 2023 15:35:40 GMT'), ('Server', 'AmazonS3')]
    DEBUG: Exception stack trace:
    Traceback (most recent call last):
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 718, in _PerformResumableDownload
        key.get_file(fp,
    TypeError: Key.get_file() got an unexpected keyword argument 'hash_algs'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 590, in GetObjectMedia
        self._PerformResumableDownload(download_stream,
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/boto_translation.py", line 725, in _PerformResumableDownload
        key.get_file(fp,
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/s3/key.py", line 1500, in get_file
        self._get_file_internal(fp, headers=headers, cb=cb, num_cb=num_cb,
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/s3/key.py", line 1536, in _get_file_internal
        self.open('r', headers, query_args=query_args,
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/s3/key.py", line 357, in open
        self.open_read(headers=headers, query_args=query_args,
      File "/Users/**REDACTED**/Documents/google-cloud-sdk/platform/gsutil/gslib/vendored/boto/boto/s3/key.py", line 317, in open_read
        raise provider.storage_response_error(self.resp.status,
    boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>**REDACTED**</RequestId><HostId>**REDACTED**</HostId></Error>
    

Solution

  • I got this to work when I came across this Stack Overflow post and realised that the .boto config on the GitHub runner (and also my local config) wasn't set up correctly. Locally it's possible to edit ~/.boto manually and add the values so that gsutil knows which AWS credentials to use, e.g.:

    [Credentials]
    aws_access_key_id = {KEY_ID}
    aws_secret_access_key = {SECRET_ACCESS_KEY}
    

    but the approach I found useful in my GitHub Action is to prefix the gsutil command with the AWS credentials as environment variables (a minimal workflow sketch is at the end of this answer):

    AWS_ACCESS_KEY_ID=XXXXXXXX AWS_SECRET_ACCESS_KEY=YYYYYYYY gsutil -m cp s3://bucket-name/filename gs://bucket-name
    

    More details in Google's official docs here: https://cloud.google.com/storage/docs/boto-gsutil
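
    For completeness, here's roughly what that looks like as a workflow step. This is just a sketch: the secret names AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY and the bucket names are placeholders to swap for your own, and it assumes gcloud/gsutil is already available on the runner.

    # Fragment of a GitHub Actions workflow (sketch)
    - name: Copy from S3 to GCS with gsutil
      run: |
        # Pass the AWS credentials as environment variables so gsutil's S3 backend can authenticate
        AWS_ACCESS_KEY_ID=${{ secrets.AWS_ACCESS_KEY_ID }} \
        AWS_SECRET_ACCESS_KEY=${{ secrets.AWS_SECRET_ACCESS_KEY }} \
        gsutil -m cp s3://source-bucket/filename gs://destination-bucket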