We have a query which is performing an aggregation, like: SELECT t.date, COUNT(*) AS rec_count FROM our_schema.log_data t WHERE t.date BETWEEN '2011-01-01' AND '2012-01-01' GROUP BY t.date;
I know we can get the quantity of data scanned by this query, but I would like to get the list of DISTINCT files that this query has accessed in S3.
I've been digging around for a bit, but can't find a system view that would give that information, and I'm not sure how we could use $path with the query to get the info.
Thoughts?
STL_S3LIST should have what you need (per comment exchange).