When specifying an S3 path in an AWS Glue crawler, can we provide a pattern so that the crawler reads only files with specific names in the S3 folder, instead of reading every file under the path?
Something like s3://sample_folder/sample_file%pattern%.csv?
Unfortunately, Glue crawlers don't support regex-based inclusion filters. Instead, you can specify a folder as the include path and set exclusion patterns, which use glob syntax. For example, set the path to s3://sample_folder and the exclusion pattern to *.{txt,avro} to filter out all txt and avro files.
See "Include and Exclude Patterns" in the AWS Glue crawler documentation for more details.
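If you create the crawler from code rather than the console, the same include path and exclusion pattern can be passed through boto3. The following is only a minimal sketch: the crawler name, IAM role ARN, and database name are hypothetical placeholders, and it assumes boto3 is installed and AWS credentials are configured.

```python
def build_s3_targets(path, exclusions):
    """Build the Targets argument for a Glue crawler with one S3 include path.

    `exclusions` is a list of glob patterns, evaluated relative to `path`.
    """
    return {"S3Targets": [{"Path": path, "Exclusions": list(exclusions)}]}


def create_crawler(name, role_arn, database, targets):
    """Create the crawler. boto3 is imported lazily so that
    build_s3_targets can be used without AWS access."""
    import boto3

    glue = boto3.client("glue")
    glue.create_crawler(
        Name=name,
        Role=role_arn,
        DatabaseName=database,
        Targets=targets,
    )


if __name__ == "__main__":
    # Path and pattern taken from the answer above.
    targets = build_s3_targets("s3://sample_folder", ["*.{txt,avro}"])

    # Hypothetical names; replace them with your own before running.
    create_crawler(
        name="sample-crawler",
        role_arn="arn:aws:iam::123456789012:role/SampleGlueRole",
        database="sample_db",
        targets=targets,
    )
```

Because only exclusions are available, keeping just the files you want means listing patterns for everything you don't want, or moving the wanted files under their own prefix and pointing the crawler there.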