Use pathlib for S3 paths


I would like to build some functionality to move files between S3 and my local file system, but pathlib appears to collapse repeated slashes, which breaks the S3 URLs I pass to the AWS CLI:

>>> from pathlib import Path

>>> str(Path('s3://loc'))
's3:/loc'

How can I manipulate S3 paths in this way?


Solution

  • You can try combining urllib.parse with pathlib: parse the URL, manipulate only its path component as a POSIX path, then reassemble the URL.

    from urllib.parse import urlparse, urlunparse
    from pathlib import PurePosixPath
    
    # Split the URL so that only the path component goes through pathlib.
    s3_url = urlparse('s3://bucket/key')
    # PurePosixPath works on any OS and never touches the file system.
    s3_path = PurePosixPath(s3_url.path)
    s3_path /= 'hello'
    s3_new_url = urlunparse((s3_url.scheme, s3_url.netloc, s3_path.as_posix(), s3_url.params, s3_url.query, s3_url.fragment))
    # or
    # s3_new_url = s3_url._replace(path=s3_path.as_posix()).geturl()
    print(s3_new_url)  # s3://bucket/key/hello
    

    It's quite cumbersome, but it's what you asked for.
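
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the name `s3_join` is made up for illustration and is not part of pathlib, urllib, or any AWS library. It assumes the input is a well-formed `s3://bucket/key` URL.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def s3_join(url: str, *parts: str) -> str:
    """Append path segments to an S3 URL, leaving scheme and bucket untouched.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    # Only the key part goes through pathlib; fall back to "/" for a bare bucket.
    new_path = PurePosixPath(parsed.path or "/").joinpath(*parts)
    # ParseResult is a namedtuple, so _replace swaps the path and geturl() reassembles.
    return parsed._replace(path=new_path.as_posix()).geturl()


print(s3_join("s3://bucket/key", "hello", "world.txt"))
# s3://bucket/key/hello/world.txt
```

Note that a segment starting with `/` replaces everything before it, as with any pathlib join, so pass relative segments only.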