Search code examples
sqlbusiness-intelligenceprestoamazon-athenaamazon-quicksight

How to work with a monthly interval of the last month in Athena (Presto)?


I am developing a query that will run on AWS QuickSight, getting information from our database in AWS Athena monthly (first day of last month until the last day of the same last month).

In order to do that, I set up a interval that the query take the date from Athena and used it in a WHERE clause, like that:

SELECT
date_format(date_parse(REPLACE(eventtimestamp, '  ', ' 0'), '%a %b %d %T %Y'), '%Y-%m') AS month,
COUNT(DISTINCT(SUBSTR(UPPER(REPLACE(REPLACE(r.stationid, ':', ''), '-', '')),1, 12))) as qty_uniq_aps,
count(*) AS qtd_of_sessions
FROM example.tableexample_parquet r
WHERE r.dt >= from_iso8601_date(CONCAT(CAST(extract(year FROM current_date) AS varchar(4)),
              '-',
              CAST(extract(month FROM current_date) AS varchar(2)),
              '-01'
))
GROUP BY  1
ORDER BY  1;

Looking at:

WHERE r.dt >= from_iso8601_date(CONCAT(CAST(extract(year FROM current_date) AS varchar(4)),
                  '-',
                  CAST(extract(month FROM current_date) AS varchar(2)),
                  '-01'
    ))

This is an example: this code will take the interval of actual_year-actual_month-first_day_of_month like this:

2020-07-01 >= the newest registry in database

I really want to make it smart. I want to transform this in a Where clause with Between, like this:

WHERE r.dt BETWEEN actual_year-last_month-first_day_of_month AND actual_year-last_month-last_day_of_month

The problem is: A) The last day of month will vary with the month (some months has 30 days, others 31 and another 28 or 29, for example). B) I found a function called last_day_of_month(x) in this link https://prestosql.io/docs/current/functions/datetime.html but isn't working on Athena.

I want to ask you some ideas about how to solve this. I searched in PrestoDB docs functions related with Date and Time but I don't have any idea how to make this without the last_day_of_month(x). Even with this answer, I can use where clause with r.dt >= the first day of last month and make the work to run the query for QuickSight in the first day of month. But I am looking for a decoupled solution. Thanks!


Solution

  • If you actually want to query the whole month, you only need to compare the year and month (no need to know days at all), so you should compare the "string" of year and month, and making sure the month is always two digits (e.g. 07). This will do the job:

    WHERE date_format ( r.dt, '%Y-%m' ) = date_format ( current_date, '%Y-%m' )
    

    If you want to run the query on previous month, but you are already now in the new month you should subtract some days from current time to get the previous month (and perhaps year if you are on last month of year..).

    So instead of current_date write:

    date_add('day', -7, current_date)