Search code examples
mysqlamazon-web-servicesamazon-s3aws-glueamazon-athena

Pass in list from S3 to Athena query


I've used AWS Glue to aggregate data from an S3 bucket, and am now using Athena to query that data. Can I now pass a separate file from S3 to query against my dataset?

For example, if I had the following query:

SELECT city,population,coordinates
FROM "Country"."Japan" 
WHERE city IN ('Tokyo','Kyoto','Osaka');

I want to place a .txt file in S3 that lists ('Tokyo','Kyoto','Osaka'), so that my query would look something like:

SELECT city,population,coordinates
FROM "Country"."Japan" 
WHERE city IN << some file path in s3 >>;

Solution

  • That is not part of the SQL standard.

    However, you could create another table in Amazon Athena that contains a list of cities. The LOCATION parameter of that table (which points to the data) could point to the text file with the name of a city on each line. You could then use:

    WHERE city IN (SELECT city FROM city_table)
    

    This way, all the information is 'within' Athena, but Athena actually goes to S3 to read the contents of the table.