Find maximum value from specific column and group by them after that


I have a table like this:

p_id | store |      createdat      | device  | deviceserial | application
------+-------+---------------------+---------+--------------+-------------
      | z10   | 2020-09-02 08:02:39 | android | 636363636890 | app-a
      | z10   | 2020-09-02 08:08:18 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 08:10:10 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 08:20:10 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 10:40:11 | IOS     | 6625839827   | app-b
      | z10   | 2020-09-02 10:45:11 | IOS     | 6625839827   | app-b
      | z10   | 2020-09-02 10:50:11 | IOS     | 6625839827   | app-b
      | z11   | 2020-09-02 08:47:10 | Android | 636363636891 | app-a
      | z11   | 2020-09-02 08:55:10 | Android | 636363636891 | app-a
      | z11   | 2020-09-02 08:59:10 | Android | 636363636891 | app-a
      | z11   | 2020-09-02 13:01:11 | IOS     | 6625839828   | app-b
      | z11   | 2020-09-02 13:15:11 | IOS     | 6625839828   | app-b
      | z10   | 2020-09-02 12:03:10 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 12:09:10 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 12:12:10 | Android | 636363636890 | app-a
      | z10   | 2020-09-02 15:15:11 | IOS     | 6625839827   | app-b
      | z10   | 2020-09-02 15:20:11 | IOS     | 6625839827   | app-b
      | z11   | 2020-09-02 10:25:10 | Android | 636363636891 | app-a
      | z11   | 2020-09-02 10:35:10 | Android | 636363636891 | app-a
      | z11   | 2020-09-02 16:39:11 | IOS     | 6625839828   | app-b

I want to group these records by hour, then find the maximum event count (event_count) within a single hour for app-a and app-b in stores z10 and z11. In reality there will be more stores and applications. I tried this query:

select distinct on (application) 
    application,
    count(*) as event_count,
    date_trunc('hour', createdat) as hr,store
from devices  
group by application,hr, store
order by application, event_count desc

That's my result:

 application | event_count |         hr          | store
-------------+-------------+---------------------+-------
 app-a       |           4 | 2020-09-02 08:00:00 | z10
 app-b       |           3 | 2020-09-02 10:00:00 | z10

I want to achieve this result:

 application | event_count |         hr          | store
-------------+-------------+---------------------+-------
 app-a       |           4 | 2020-09-02 08:00:00 | z10
 app-b       |           3 | 2020-09-02 10:00:00 | z10
 app-a       |           3 | 2020-09-02 08:00:00 | z11
 app-b       |           2 | 2020-09-02 13:00:00 | z11
 

Solution

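  • Your query keeps only one row per application because distinct on lists application alone. If you want the peak hour for each application/store pair, put both columns in distinct on (and lead the order by with them), like this: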

    select distinct on (application, store) 
        application, store,
        count(*) as event_count,
        date_trunc('hour', createdat) as hr
    from mytable  
    group by application, store, hr
    order by application, store, event_count desc
    

    Demo on DB Fiddle:

    application | store | event_count | hr                 
    :---------- | :---- | ----------: | :------------------
    app-a       | z10   |           4 | 2020-09-02 08:00:00
    app-a       | z11   |           3 | 2020-09-02 08:00:00
    app-b       | z10   |           3 | 2020-09-02 10:00:00
    app-b       | z11   |           2 | 2020-09-02 13:00:00
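
One thing distinct on does not show is ties: if two hours share the same top count for an application/store pair, it keeps a single arbitrary one unless the order by gets a tie-breaker (for example, adding hr after event_count desc). As a rough alternative sketch, a window function can return every tied peak hour instead. This assumes the table is named devices, as in the question's query; the alias hourly and the column rnk are made up for illustration.

```sql
-- Sketch only: assumes the table is called devices (as in the question).
-- "hourly" and "rnk" are illustrative names, not part of the original answer.
select application, store, event_count, hr
from (
    select
        application,
        store,
        date_trunc('hour', createdat) as hr,
        count(*) as event_count,
        -- rank the hourly counts within each application/store pair;
        -- hours with equal counts receive the same rank
        rank() over (
            partition by application, store
            order by count(*) desc
        ) as rnk
    from devices
    group by application, store, date_trunc('hour', createdat)
) as hourly
where rnk = 1          -- keep the peak hour(s), including ties
order by store, application;
```

Swapping rank() for row_number() would go back to exactly one row per pair, which is the same behavior as distinct on. The final order by store, application is only there to list the rows in the order used in the desired result in the question.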