Search code examples
sqlgroupingimpala

Grouping a datetime by a specific amount of minutes in SQL/Impala


I'm using Impala. My table has separate columns for the date (in the format YYYYMMDD as STRING), hour and minutes (as INT) For example, I get:

date_part    hour_part    minute_part    kpi
20200828     10           10             23
20200828     10           15             64
20200828     10           20             18

To combine these columns, I have used the following syntax:

MINUTES_ADD(HOURS_ADD(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(CAST(date_part AS STRING), "yyyyMMdd"))), hour_part), minute_part) AS date_time

Therefore I get:

date_time              kpi
2020-08-28 10:10:00    23
2020-08-28 10:15:00    64
2020-08-28 10:20:00    18

(please note that my data only is populated each 5 minutes bin)

Now, if I want to calculate a AVG(kpi) over 1 hour, I can do the following GROUP BY:

SELECT
    MINUTES_ADD(HOURS_ADD(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(CAST(date_part AS STRING), "yyyyMMdd"))), hour_part), minute_part) AS date_time,
    AVG(kpi)
FROM table
GROUP BY
    DATE_PART('year', date_time),
    DATE_PART('month', date_time),
    DATE_PART('day', date_time),
    DATE_PART('hour', date_time)

However, how can I get the query to give an aggregation over 15 minutes?


Solution

  • You can use arithmetic on minutes:

    SELECT MINUTES_ADD(HOURS_ADD(TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP(CAST(date_part AS STRING), "yyyyMMdd"))), hour_part), FLOOR(minute_part / 15)) AS date_time,
        AVG(kpi)
    FROM table
    GROUP BY date_part, hour_part,
             FLOOR(minute_part / 15)