
Amazon S3 - how to increase bucket to bucket file copy speed?


I'm working on a video migration (PHP, Drupal 10) that copies files from one S3 bucket to another. I'm using the copyObject() method of the S3FileImport class to copy the files.

The problem is that I have many small files (video segments), and the migration takes a long time. For example, one video can have around 500 segment files, and copying all of them takes about 2 minutes. I also have a lot of videos.

Question: how much would the copy speed increase if the source files were in the same bucket? The idea is to use some bulk copy to first move all of the files into the same bucket (Drupal's local storage) and then run the migration. Would that make sense?

If not, are there any other suggestions for increasing the copy speed?


Solution

  • The copyObject() method instructs S3 to copy the object between locations. This happens entirely within AWS and does not involve downloading or uploading the data. It operates at the same speed whether the objects are copied within one bucket or between different buckets (as long as the buckets are in the same region).

    The issue you are facing is that there is an overhead for each copy request, mostly in sending the request and receiving the response. To speed up the copy process, you can issue copy requests in parallel. That is, rather than waiting for each response before issuing the next request, keep many requests in flight at once (using multiple threads or, in PHP, asynchronous requests). That is how the AWS CLI copies files so quickly.

    The best solution, of course, is not to copy those files if possible. Perhaps you can find a way for your app to read from the actual source rather than having to copy all the objects first.
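
As a rough illustration of the parallel-copy suggestion above, here is a minimal sketch using the `CommandPool` class from the AWS SDK for PHP (v3). It is not taken from the question's S3FileImport class: the function names `buildCopyParams` and `copyObjectsConcurrently`, the bucket names, and the concurrency of 25 are all assumptions for illustration, and it assumes Composer's autoloader is already loaded (as it is inside Drupal).

```php
<?php

use Aws\CommandPool;
use Aws\Exception\AwsException;
use Aws\S3\S3Client;

/**
 * Builds the CopyObject parameters for each key (hypothetical helper, no network calls).
 */
function buildCopyParams(string $sourceBucket, string $targetBucket, array $keys): array
{
    $params = [];
    foreach ($keys as $key) {
        $params[] = [
            'Bucket'     => $targetBucket,
            // CopySource has the form "bucket/key". The key is URL-encoded here
            // while keeping "/" readable (an assumption about what S3 expects).
            'CopySource' => $sourceBucket . '/' . str_replace('%2F', '/', rawurlencode($key)),
            'Key'        => $key,
        ];
    }
    return $params;
}

/**
 * Copies the given keys with several requests in flight at once (hypothetical helper).
 * Returns an array of [key => error message] for the copies that failed.
 */
function copyObjectsConcurrently(
    S3Client $s3,
    string $sourceBucket,
    string $targetBucket,
    array $keys,
    int $concurrency = 25
): array {
    $keys = array_values($keys); // ensure 0-based indexes to match the pool's iterator keys

    // Lazily create one CopyObject command per key.
    $commands = (function () use ($s3, $sourceBucket, $targetBucket, $keys) {
        foreach (buildCopyParams($sourceBucket, $targetBucket, $keys) as $params) {
            yield $s3->getCommand('CopyObject', $params);
        }
    })();

    $failed = [];
    $pool = new CommandPool($s3, $commands, [
        'concurrency' => $concurrency, // maximum number of simultaneous requests
        'rejected'    => function (AwsException $reason, $index) use (&$failed, $keys) {
            $failed[$keys[$index]] = $reason->getAwsErrorMessage();
        },
    ]);

    // Start the transfers and block until every command has completed or failed.
    $pool->promise()->wait();

    return $failed;
}
```

A call might look like `$failed = copyObjectsConcurrently($s3, 'source-bucket', 'target-bucket', $segmentKeys);`, where `$segmentKeys` holds the segment keys for one video. The best concurrency value depends on your environment, so treat 25 as a starting point to tune rather than a recommendation.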