I am trying to sync two S3 buckets:
s4cmd --dry-run sync s3://cgl-rnaseq-recompute-fixed/gtex s3://rnaseq.toil.20k/gtex
But I am getting the following error:
[Exception] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
[Thread Failure] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
The source bucket is publicly available. The second bucket is mine and I have access to it:
[centos@ip-172-30-3-12 data]$ s4cmd ls s3://rnaseq.toil.20k/
DIR s3://rnaseq.toil.20k/gtex/
DIR s3://rnaseq.toil.20k/pnoc/
DIR s3://rnaseq.toil.20k/target/
DIR s3://rnaseq.toil.20k/tcga/
Also, I cannot ls on the source bucket using s4cmd, but I can using s3cmd:
[centos@ip-172-30-3-12 data]$ s4cmd ls s3://cgl-rnaseq-recompute-fixed/gtex
[Exception] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
[Thread Failure] An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied
[centos@ip-172-30-3-12 data]$ s3cmd ls --requester-pays s3://cgl-rnaseq-recompute-fixed/gtex
DIR s3://cgl-rnaseq-recompute-fixed/gtex/
2016-06-03 17:02 435553 s3://cgl-rnaseq-recompute-fixed/gtex-manifest
What could be going wrong? Any suggestions would be much appreciated.
To achieve the s3cmd behavior, use wildcards:
s4cmd sync s3://bucket/path/dirA/* s3://bucket/path/dirB/
Note that s4cmd does not support the rsync-style convention in which dirA without a trailing slash stands for dirA/*.
So in your case, you have to use:
s4cmd --dry-run sync s3://cgl-rnaseq-recompute-fixed/gtex/* s3://rnaseq.toil.20k/gtex
Check the s4cmd documentation; it is very helpful.
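If you sync several prefixes, it may help to build the wildcard form in one place instead of typing it each time. The following is only a minimal sketch: to_wildcard is a hypothetical helper name (not part of s4cmd), and the final line prints the command instead of running it so that you can inspect it first.

```shell
#!/bin/sh
# Hypothetical helper (not part of s4cmd): turn a directory-style S3 prefix
# into the wildcard form that the sync command above uses.
to_wildcard() {
  src="${1%/}"                        # drop a single trailing slash, if present
  case "$src" in
    *'/*') printf '%s\n' "$src" ;;    # already ends in /*, leave unchanged
    *)     printf '%s/*\n' "$src" ;;  # otherwise append /*
  esac
}

SRC=$(to_wildcard "s3://cgl-rnaseq-recompute-fixed/gtex")
DST="s3://rnaseq.toil.20k/gtex"

# Print the dry-run command for review; remove the leading echo to run it.
echo s4cmd --dry-run sync "$SRC" "$DST"
```

The variables are quoted so that the local shell does not try to expand the trailing * itself before s4cmd sees it.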