I am copying multiple Parquet files between S3 buckets in different AWS accounts. As I copy them to the destination bucket, I want to rename the files.
import boto3
s3_client = boto3.client('s3')
s3_resource = boto3.resource('s3')
bucket = 'sourcebucket'
folder_path = 'source_folder/'
resp = s3_client.list_objects(Bucket=bucket, Prefix=folder_path)
keys = []
for obj in resp['Contents']:
    keys.append(obj['Key'])
for key in keys:
    copy_source = {
        'Bucket': 'sourcebucket',
        'Key': key
    }
    file_name = key.split('/')[-1]
    s3_file = 'dest_folder/' + 'xyz' + file_name
    bucketdest = s3_resource.Bucket('destinationbucket')
    bucketdest.copy(copy_source, s3_file, ExtraArgs={'GrantFullControl': 'id = " "'})
This is what I have tried. I can see the files in my destination bucket with the new name but they have no actual data.
Thanks!
Your code works perfectly fine for me! (However, I ran it without the ExtraArgs since I didn't have an ID.)
When I copy objects between buckets, the setting I use is:
ExtraArgs={'ACL':'bucket-owner-full-control'}
I doubt this small change would have affected the contents of your objects.
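To confirm whether the copied objects really are empty, one option is to compare the size S3 reports for the source and the destination object. The following is only a sketch: compare_sizes is a hypothetical helper name, and it assumes the credentials in use are allowed to call head_object on both buckets, which may not hold in a cross-account setup.

```python
def compare_sizes(s3_client, source_bucket, source_key, target_bucket, target_key):
    """Return (source_size, target_size) in bytes, as reported by S3.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client('s3').
    The helper name and its parameters are illustrative, not part of boto3.
    """
    src = s3_client.head_object(Bucket=source_bucket, Key=source_key)
    dst = s3_client.head_object(Bucket=target_bucket, Key=target_key)
    # 'ContentLength' is the object size in bytes
    return src['ContentLength'], dst['ContentLength']
```

If the two sizes match, the data was copied and the problem is more likely in how the files are being read afterwards. If the destination size is 0 while the source is not, the copy itself is the place to look.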
By the way, it might be a good idea to use either Client methods or Resource methods, not both. Mixing them can lead to confusing code and potential problems.
So, you could use something like:
Client method:
# source_bucket, source_prefix and target_bucket are assumed to be defined
response = s3_client.list_objects(Bucket=source_bucket, Prefix=source_prefix)
for obj in response['Contents']:
    copy_source = {
        'Bucket': source_bucket,
        'Key': obj['Key']
    }
    s3_client.copy_object(
        Bucket=target_bucket,
        # New name: 'xyz' + original file name, placed under dest_folder/
        Key='dest_folder/' + 'xyz' + obj['Key'].split('/')[-1],
        CopySource=copy_source,
        ACL='bucket-owner-full-control'
    )
or you could use:
Resource method:
# source_bucket, source_prefix and target_bucket are assumed to be defined
for obj in s3_resource.Bucket(source_bucket).objects.filter(Prefix=source_prefix):
    copy_source = {
        'Bucket': source_bucket,
        'Key': obj.key
    }
    s3_resource.Bucket(target_bucket).copy(
        CopySource=copy_source,
        # New name: 'xyz' + original file name, placed under dest_folder/
        Key='dest_folder/' + 'xyz' + obj.key.split('/')[-1],
        ExtraArgs={'ACL': 'bucket-owner-full-control'}
    )
(Warning: I didn't test those snippets.)
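One thing that might be worth guarding against in either version: if the source prefix itself exists as a zero-byte "folder" placeholder object (for example a key of 'source_folder/'), it is returned by the listing too, and key.split('/')[-1] is then an empty string. The sketch below pulls the renaming into a small function that skips such keys. build_target_key and its default arguments are hypothetical names taken from the values used in the question.

```python
def build_target_key(source_key, dest_prefix='dest_folder/', name_prefix='xyz'):
    """Return the renamed destination key, or None for "folder" placeholder keys.

    The function name and defaults are illustrative; they mirror the
    'dest_folder/' and 'xyz' values from the question.
    """
    file_name = source_key.split('/')[-1]
    if not file_name:
        # Key ends with '/', so there is no file name to rename
        return None
    return dest_prefix + name_prefix + file_name


print(build_target_key('source_folder/part-0000.parquet'))
# dest_folder/xyzpart-0000.parquet
print(build_target_key('source_folder/'))
# None
```

In the copy loop, a key for which this returns None would simply be skipped with continue.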