Tags: python, amazon-web-services, amazon-s3, boto3, aws-glue

AWS Glue Python FileNotFoundError: [Errno 2] No such file or directory


I am trying to use AWS Glue to move files between S3 buckets in different accounts. I am using a Glue Python shell job. I have list and get-object permissions on the source bucket. I am able to list all the files, but when I try to load them into the destination bucket I get the error "FileNotFoundError: [Errno 2] No such file or directory: 'test/f1=x/type=b/file1.parquet'".

The files in the source S3 bucket are partitioned:

test/f1=x/type=a/file1.parquet
test/f1=x/type=a/file2.parquet
test/f1=x/type=b/file1.parquet
test/f1=x/type=b/file2.parquet

I am only trying to load the files with f1=x and type=b:

import pandas as pd
import boto3

client = boto3.client('s3')
bucket = 'mysourcebucketname'
folder_path = 'test/f1=x/type=b/'

def my_keys(bucket, folder_path):
    # Collect the keys of all objects under the given prefix
    keys = []
    resp = client.list_objects(Bucket=bucket, Prefix=folder_path)
    for obj in resp['Contents']:
        keys.append(obj['Key'])
    return keys

files = my_keys(bucket, folder_path)
#print(files)

for file in files:
    bucketdest = 'mydestinationbucket'
    new_file_name = file.split('/')[-1]
    s3_file = 'destfolder1/destfolder2/' + "typeb" + new_file_name
    # This call raises the FileNotFoundError
    client.upload_file(file, bucketdest, s3_file, ExtraArgs={'GrantFullControl': 'id=""'})

Solution

  • upload_file uploads a file from the local drive to S3, so your code is looking for a local file called test/f1=x/type=b/file1.parquet. That file does not exist locally, because the object lives in S3, as you wrote. Maybe you want to download these files first (download_file) and then upload them, or copy them directly from one bucket to the other?
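
As a rough illustration of the bucket-to-bucket route, here is a minimal sketch that uses the boto3 client's managed copy method instead of upload_file, so no local file is involved. The helper names copy_prefix, dest_key_for and run are hypothetical (they are not from the question), the bucket names and prefixes are the ones from the question, and it assumes the Glue job's role can read the source bucket and write to the destination bucket.

```python
def dest_key_for(source_key, dest_prefix="destfolder1/destfolder2/", tag="typeb"):
    """Build the destination key the same way the question's loop does."""
    file_name = source_key.split("/")[-1]
    return f"{dest_prefix}{tag}{file_name}"


def copy_prefix(s3_client, source_bucket, prefix, dest_bucket, extra_args=None):
    """Copy every object under `prefix` from source_bucket to dest_bucket."""
    copied = []
    # Paginate so that prefixes holding more than 1000 objects are fully listed.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            new_key = dest_key_for(key)
            # S3-to-S3 copy: the source is an S3 object, not a local path.
            s3_client.copy(
                CopySource={"Bucket": source_bucket, "Key": key},
                Bucket=dest_bucket,
                Key=new_key,
                ExtraArgs=extra_args,
            )
            copied.append(new_key)
    return copied


def run():
    # boto3 is imported here so the helpers above can be exercised without AWS.
    import boto3

    return copy_prefix(
        boto3.client("s3"),
        "mysourcebucketname",
        "test/f1=x/type=b/",
        "mydestinationbucket",
    )
```

Calling run() from the Glue job script would perform the copy. If the destination account needs an ACL grant like the one in the question, it could be passed through extra_args, for example extra_args={'GrantFullControl': ...} with the grantee id filled in. If a local copy is actually needed, client.download_file followed by client.upload_file is the other option.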