amazon-web-services, aws-glue, amazon-athena

Partitioning using a substring of the S3 path


We have an S3 bucket whose files are named in the following format:

6ugdasznp56o_2020-09-04T140000_6081c358e0417bdd81284b0cf7a6b321_2853a9.csv.gz

Is it possible to define a storage.location.template as follows:

6ugdasznp56o_${year}-${month}-${date}T${hour}0000_6081c358e0417bdd81284b0cf7a6b321_2853a9.csv.gz

to partition my files in this S3 bucket?


Solution

  • Partitioning works only at the folder (S3 key prefix) level, not at the level of individual file names; see also Table Location in Amazon S3:

    Do not use any of the following items for specifying the LOCATION for your data.

    • Do not use filenames, underscores, wildcards, or glob patterns for specifying file locations.

    Examples that won't work:

    ...
    s3://path_to_bucket/mySpecialFile.dat
    s3://bucketname/prefix/filename.csv
    ...
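
Since the date parts have to live in the folder structure rather than in the file name, one option is to move each object under a prefix derived from its name. The sketch below only computes such a key for the file name format shown in the question. The `partitioned_key` helper, the `data` prefix, and the Hive-style `year=/month=/day=/hour=` layout are illustrative assumptions, not part of the original answer, and actually copying the objects would be a separate step.

```python
import re

# Assumed pattern for names like
# 6ugdasznp56o_2020-09-04T140000_6081c358e0417bdd81284b0cf7a6b321_2853a9.csv.gz
# (an id, then a timestamp, then further underscore-separated fields).
FILENAME_RE = re.compile(
    r"^[^_]+_(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})T(?P<hour>\d{2})\d{4}_"
)


def partitioned_key(filename: str, prefix: str = "data") -> str:
    """Return a folder-style S3 key built from the timestamp in the file name.

    Both the helper name and the default prefix are hypothetical.
    """
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename format: {filename!r}")
    parts = match.groupdict()
    # The date parts become path segments; the original file name is kept.
    return (
        f"{prefix}/year={parts['year']}/month={parts['month']}/"
        f"day={parts['day']}/hour={parts['hour']}/{filename}"
    )


if __name__ == "__main__":
    print(partitioned_key(
        "6ugdasznp56o_2020-09-04T140000_6081c358e0417bdd81284b0cf7a6b321_2853a9.csv.gz"
    ))
```

With the objects laid out like this, the table LOCATION can be the common prefix, and a storage.location.template could then reference the folder segments instead of parts of the file name.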