Search code examples
mysqlimpala

Impala/SQL: Can I have different time-period for each group?


I have the following table:

id   | timestamp  |  team  
----------------------------
1    | 2016-05-06 |  A
2    | 2016-03-02 |  A
3    | 2015-12-01 |  A
4    | 2016-07-05 |  B
5    | 2016-06-30 |  B
6    | 2016-06-28 |  B
7    | 2016-04-05 |  C
8    | 2016-04-02 |  C
9    | 2016-01-02 |  C

I want to group by team and find the last timestamp for each team, so I did:

select team, max(timestamp) from my_table group by team

It's all working fine so far. However, now I want to find out how many distinct id in the last month of each team. For example, for team A, it would be from 2016-04-07 to 2016-05-06, so such count is 1. For team B, the last month is from 2016-06-06 to 2016-07-05, so the count is 3. And for team C, the last month is from 2016-03-06 to 2016-04-05, and the count is 2. My expected output should look like:

team  | max(timestamp)  |  count_in_last_month
------------------------------------------------
A     | 2016-05-06      |  1
B     | 2016-07-05      |  3
C     | 2016-04-05      |  2

Can this be derived using the Impala query? Thanks!


Solution

  • Join the original table with the subquery that gets the max timestamp.

    SELECT t1.team, t2.month_end, COUNT(DISTINCT t1.id) AS count_in_last_month
    FROM my_table AS t1
    JOIN (SELECT team, MAX(timestamp) AS month_end
          FROM my_table
          GROUP BY team) AS t2
    ON t1.team = t2.team 
    AND t1.timestamp BETWEEN DATE_SUB(month_end, INTERVAL 1 MONTH) AND month_end
    GROUP BY t1.team, t2.month_end
    

    DEMO