Tags: amazon-web-services, amazon-s3, parquet, pyarrow

Using an AWS profile with pyarrow.fs.S3FileSystem


I am trying to use a specific AWS profile with Apache PyArrow. The documentation shows no option to pass a profile name when instantiating S3FileSystem using pyarrow.fs [https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html].

I tried to get around this by creating a session with boto3 and passing it in:

import boto3
from pyarrow import fs

# include mfa profile
session = boto3.session.Session(profile_name="custom_profile")

# create filesystem with session
bucket = fs.S3FileSystem(session_name=session)

bucket.get_file_info(fs.FileSelector('bucket_name', recursive=True))

but this too fails:

OSError: When listing objects under key '' in bucket 'bucket_name': AWS Error [code 15]: Access Denied

Is it possible to use fs with a custom AWS profile?

My ~/.aws/credentials:

[default]
aws_access_key_id = <access_key>
aws_secret_access_key = <secret_key>

[custom_profile]
aws_access_key_id = <access_key>
aws_secret_access_key = <secret_key>
aws_session_token = <token>

Additional context: all user actions require MFA. The custom_profile entry in the credentials file stores the session token generated after MFA-based authentication on the CLI; I need to use that profile in the script.


Solution

  • One can specify a session token, but must then also specify the access key and secret key:

    s3 = fs.S3FileSystem(access_key="<access_key>",
                         secret_key="<secret_key>",
                         session_token="<token>")

    One would also have to implement some method of parsing the ~/.aws/credentials file to obtain these values, or copy them in manually each time.
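As a sketch of that parsing step, the stdlib configparser module can read the INI-style credentials file directly; the helper name `load_aws_profile` and the profile/bucket names below are illustrative assumptions, not part of the original answer:

```python
import configparser
import os


def load_aws_profile(profile, path=os.path.expanduser("~/.aws/credentials")):
    """Return (access_key, secret_key, session_token) for a named profile.

    session_token is None when the profile has no aws_session_token entry.
    """
    config = configparser.ConfigParser()
    config.read(path)
    section = config[profile]  # raises KeyError if the profile is missing
    return (
        section["aws_access_key_id"],
        section["aws_secret_access_key"],
        section.get("aws_session_token"),
    )


# Usage sketch (assumes the profile exists and pyarrow is installed):
# from pyarrow import fs
# access_key, secret_key, token = load_aws_profile("custom_profile")
# s3 = fs.S3FileSystem(access_key=access_key,
#                      secret_key=secret_key,
#                      session_token=token)
```

Alternatively, boto3 can do the parsing for you: `boto3.session.Session(profile_name="custom_profile").get_credentials()` returns an object with `access_key`, `secret_key`, and `token` attributes that can be passed to S3FileSystem the same way.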