amazon-web-services aws-glue amazon-athena

How does AWS Athena scale with the size of the data scanned?

I have a table with S3 JSON as its source, partitioned by:

year
month
day
hour

Partition projection is enabled (projection.enabled = true) with standard ranges for these partition keys. Running a query like:

SELECT count(*) FROM my_table WHERE year=2022 and month=10 and day=28 or day=29 or day=30

Took:

8 seconds for one day,
25 seconds for two days,
48 seconds for three days

Initially I expected the time to be constant: I thought Athena would spin up as many "crawlers" as there are files to be scanned.

Can I predict how this will scale?

Solution

While it is very hard to predict how Athena scales, I can say that the V3 engine works much faster than the V2 engine.
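
One thing worth checking before comparing timings: in SQL, AND binds more tightly than OR, so the filter in the question is parsed as (year=2022 and month=10 and day=28) or day=29 or day=30. If the query was run exactly as written, the day=29 and day=30 conditions were not limited to October 2022, which may explain part of the growth in runtime. Below is a sketch of what the intended filter probably looks like, assuming the partition keys are integer-typed as the unquoted literals in the question suggest:

```sql
-- Parenthesized form: the day filter stays inside October 2022.
SELECT count(*)
FROM my_table
WHERE year = 2022
  AND month = 10
  AND (day = 28 OR day = 29 OR day = 30);

-- Equivalent form using IN.
SELECT count(*)
FROM my_table
WHERE year = 2022
  AND month = 10
  AND day IN (28, 29, 30);
```

Comparing the "Data scanned" figure that Athena reports for the one-day, two-day, and three-day runs should show whether the queries touched only the partitions you meant them to.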