Search code examples
sqloracle-databasegreatest-n-per-group

SQL get products and messages group by


I need to write a Oracle SQL query. I have two tables products and messages. Product table looks like this:

product_id creation_date user_id category_id
p1 2017-03-01 u1 c1
p2 2018-05-23 u1 c3
p3 2019-06-21 u2 c1

messages table looks like this:

message_id creation_date product_id user_from
m1 2018-03-01 p1 u2
m2 2019-08-19 p1 u5
m3 2020-10-10 p3 u7

I want to list all the products within a category, sorted by their total number of messages, together with their top 5 buyers of each product (users who contacted those products sorted by their total number of messages sent)

Sample output table:

category_id product_id total_messages_for_product user_id messages
c1 p1 200 u1 10
c1 p1 200 u2 9
c1 p1 200 u3 7
c1 p1 200 u4 5
c1 p1 200 u5 4
c1 p2 150 u7 11
c1 p2 150 u8 10
c1 p2 150 u9 9
c1 p2 150 u10 7
c1 p2 150 u4 6

Solution

  • You appear to want:

    SELECT p.category_id,
           p.product_id,
           m.total_messages_for_product,
           m.user_from AS user_id,
           m.messages
    FROM   products p
           INNER JOIN (
             SELECT product_id,
                    user_from,
                    COUNT(*) AS messages,
                    SUM( COUNT(*) ) OVER ( PARTITION BY product_id )
                      AS total_messages_for_product,
                    RANK() OVER (
                      PARTITION BY product_id ORDER BY COUNT(*) DESC
                    ) AS messages_rank
             FROM   messages
             GROUP BY product_id, user_from
           ) m
           ON ( p.product_id = m.product_id )
    WHERE  m.messages_rank <= 5;
    

    (Note: you could, instead, use ROW_NUMBER to get the top 5 entries without ties rather than RANK which returns the top 5 entries with ties.)

    Which, for your sample data:

    CREATE TABLE products ( product_id, creation_date, user_id, category_id ) AS
    SELECT 'p1', DATE '2017-03-01', 'u1', 'c1' FROM DUAL UNION ALL
    SELECT 'p2', DATE '2018-05-23', 'u1', 'c3' FROM DUAL UNION ALL
    SELECT 'p3', DATE '2019-06-21', 'u2', 'c1' FROM DUAL;
    
    CREATE TABLE messages( message_id, creation_date, product_id, user_from ) AS
    SELECT 'm1', DATE '2018-03-01', 'p1', 'u2' FROM DUAL UNION ALL
    SELECT 'm2', DATE '2019-08-19', 'p1', 'u5' FROM DUAL UNION ALL
    SELECT 'm3', DATE '2020-10-10', 'p3', 'u7' FROM DUAL;
    

    Outputs:

    CATEGORY_ID | PRODUCT_ID | TOTAL_MESSAGES_FOR_PRODUCT | USER_ID | MESSAGES
    :---------- | :--------- | -------------------------: | :------ | -------:
    c1          | p1         |                          2 | u5      |        1
    c1          | p1         |                          2 | u2      |        1
    c1          | p3         |                          1 | u7      |        1
    

    db<>fiddle here