Tags: amazon-web-services, amazon-s3, aws-datasync

AWS DataSync S3 -> S3 cross-account: confused about destination role/account


I want to use DataSync to copy data from a single S3 bucket in one account to a single S3 bucket in another account. I'm following the second section, "Copying objects across accounts", of this official AWS DataSync blog post: https://aws.amazon.com/blogs/storage/how-to-use-aws-datasync-to-migrate-data-between-amazon-s3-buckets/

I've set up the source and destination buckets and completed the first two steps (you can find them in the blog by searching for the quoted strings):

  • "Create a new IAM role and attach a new IAM policy for the source S3 bucket location"
  • "Add the following trust relationship to the IAM role"

I'm now confused about which account to use for the remaining steps:

  • "Open the source S3 bucket policy and apply the following policy to grant permissions for the IAM role to access the objects"
  • running the AWS CLI command "aws sts get-caller-identity"
  • running the "aws datasync create-location-s3" command straight after that

Am I doing those on the source or the destination account?

The blog is unclear on those specific steps, and I can't find a simpler guide anywhere.


Solution

  • The source S3 bucket policy is attached to the source S3 bucket, so you'll need to log into the source account to edit that.

    The next steps have to be done from the CLI. The wording is a bit ambiguous, but the key phrase is "ensure you’re using the same IAM identity you specified in the source S3 bucket policy created in the preceding step." The IAM identity referenced in the example S3 bucket policy is arn:aws:iam::DEST-ACCOUNT-ID:role/DEST-ACCOUNT-USER, so you need to be authenticated to the destination account for the CLI steps. The aws sts get-caller-identity command just returns the identity used to execute the command; it's there to confirm that you're using the expected identity and isn't strictly required for setting up the DataSync location.
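As a rough sketch of that identity check, the snippet below parses the JSON printed by aws sts get-caller-identity and compares it with the principal named in the source bucket policy. The function names, the sample account ID 222222222222, and the session name are made up for illustration; only the "Arn" field of the command output is relied on. One assumption worth flagging: when you are using an assumed role, the CLI reports an STS assumed-role ARN rather than the IAM role ARN written in the policy, so the sketch maps one to the other before comparing.

```python
import json
import re


def normalize_caller_arn(arn: str) -> str:
    """Map an STS assumed-role ARN to the IAM role ARN it came from.

    arn:aws:sts::ACCOUNT:assumed-role/ROLE/SESSION
        -> arn:aws:iam::ACCOUNT:role/ROLE
    Other ARNs (for example IAM users) are returned unchanged.
    Caveat: assumed-role ARNs omit any role path, so a role created
    under a path will not round-trip exactly.
    """
    match = re.fullmatch(
        r"arn:(aws[\w-]*):sts::(\d{12}):assumed-role/([^/]+)/.+", arn
    )
    if match is None:
        return arn
    partition, account, role = match.groups()
    return f"arn:{partition}:iam::{account}:role/{role}"


def caller_matches_policy(caller_identity_json: str, policy_principal_arn: str) -> bool:
    """Return True if the CLI caller is the principal named in the bucket policy."""
    identity = json.loads(caller_identity_json)
    return normalize_caller_arn(identity["Arn"]) == policy_principal_arn


# Hypothetical output of `aws sts get-caller-identity` in the destination account.
sample_output = json.dumps({
    "UserId": "AROAEXAMPLEID:cli-session",
    "Account": "222222222222",
    "Arn": "arn:aws:sts::222222222222:assumed-role/DEST-ACCOUNT-USER/cli-session",
})
# Principal as written in the source bucket policy (placeholder account ID filled in).
principal = "arn:aws:iam::222222222222:role/DEST-ACCOUNT-USER"

print(caller_matches_policy(sample_output, principal))
```

If the comparison fails, you are most likely still authenticated to the source account or using a different role than the one the bucket policy names.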

    It's not explicitly mentioned in the tutorial, but the user in the destination account also needs appropriate IAM permissions to create the DataSync locations and task.
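Since the tutorial does not list these permissions, the following is only an illustrative guess at a small identity policy for the destination-account user or role, written as a Python dictionary so it can be dumped to JSON. The action list, the statement IDs, and the role name DATASYNC-BUCKET-ACCESS-ROLE are assumptions, not taken from the blog; check the DataSync documentation for the actions your workflow actually needs.

```python
import json


def datasync_operator_policy(bucket_access_role_arn: str) -> dict:
    """Build a hypothetical IAM policy for whoever runs the DataSync CLI steps."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed minimal set of DataSync actions for this walkthrough.
                "Sid": "ManageDataSyncLocationsAndTasks",
                "Effect": "Allow",
                "Action": [
                    "datasync:CreateLocationS3",
                    "datasync:CreateTask",
                    "datasync:StartTaskExecution",
                    "datasync:DescribeTaskExecution",
                ],
                "Resource": "*",
            },
            {
                # Lets the caller hand the bucket access role to DataSync.
                "Sid": "PassBucketAccessRoleToDataSync",
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": bucket_access_role_arn,
            },
        ],
    }


# Placeholder ARN: substitute the role you created for the source location.
policy = datasync_operator_policy(
    "arn:aws:iam::222222222222:role/DATASYNC-BUCKET-ACCESS-ROLE"
)
print(json.dumps(policy, indent=2))
```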

    It may help to think of it this way: you need to allow a role in the destination account to access the bucket in the source account, and then you set up the DataSync locations and tasks in the destination account. So anything related to DataSync configuration needs to happen in the destination account.
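To make that split concrete, here is a small sketch that assembles the aws datasync create-location-s3 command you would run while authenticated to the destination account. It only builds and prints the command string and does not call AWS. The bucket name, role name, account ID, and region are placeholders, and the helper function is hypothetical; the --s3-bucket-arn and --s3-config (BucketAccessRoleArn) options are the ones the CLI command takes.

```python
import json
import shlex


def create_source_location_command(
    source_bucket: str, bucket_access_role_arn: str, region: str
) -> str:
    """Build the CLI command that registers the source bucket as a DataSync location.

    The bucket lives in the source account, but the role ARN and the
    credentials used to run the command belong to the destination account.
    """
    s3_config = json.dumps({"BucketAccessRoleArn": bucket_access_role_arn})
    args = [
        "aws", "datasync", "create-location-s3",
        "--s3-bucket-arn", f"arn:aws:s3:::{source_bucket}",
        "--s3-config", s3_config,
        "--region", region,
    ]
    # shlex.join quotes the JSON argument so it can be pasted into a shell.
    return shlex.join(args)


command = create_source_location_command(
    source_bucket="my-source-bucket",  # placeholder bucket in the source account
    bucket_access_role_arn="arn:aws:iam::222222222222:role/DATASYNC-BUCKET-ACCESS-ROLE",
    region="us-east-1",  # placeholder region
)
print(command)
```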