Search code examples
mysqlsqlmysql-workbench

Top selling product per user_id and categorize by each user's buying activity (first-time, second-time, etc.)


I'm working on MySQL workbench and I would like to find the top-selling products by types of buyers. First I categorized customers based on their buying activity. If a person made a purchase only on 1/1, it's a first-time buyer. If a person made a purchase on 1/1 and 1/2, it's a second-time buyer. If a person made a purchase on 1/1, 1/2, and 1/3, it's a third-time buyer. And so on.

The following is sample data, similar to my original data.

user_id PurchaseDate Product Total order_id
1 2023-01-01 16:46 yogurt 3.72 1000
1 2023-01-01 18:23 milk 4.38 null
2 2023-01-01 12:34 egg 4.57 1001
1 2023-01-01 11:54 butter 3.69 null
3 2023-01-01 10:41 cheese 3.68 1003
2 2023-01-01 13:45 milk 4.38 null
3 2023-01-03 18:56 yogurt 3.72 1005
3 2023-01-02 19:12 cream 4.98 1004
2 2023-01-02 12:47 egg 4.57 1002

If a customer makes more than one purchase on the same date, the order_id is null. So, I used that to categorize the buyers.

SELECT      user_id, product, sum(Total) as Total,
            CASE WHEN Count(DISTINCT(order_id)) = 1 THEN "FirstTimeBuyer" 
                 WHEN Count(DISTINCT(order_id)) = 2 THEN "SecondTimeBuyer" 
                 WHEN Count(DISTINCT(order_id)) = 3 THEN "ThirdTimeBuyer"
                 END AS TypeofBuyer
FROM    myData
GROUP BY user_id
ORDER BY Total DESC;

And I get the following result

user_id Product Total TypeofBuyer
2 egg 13.52 SecondTimeBuyer
3 cheese 12.38 ThirdTimeBuyer
1 yogurt 11.79 FirstTimeBuyer

But the problem is since I aggregated the total and grouped it by user_id, I cannot find the top-selling product for each user_id.

For instance, for user_id 1, the top-selling product is milk and the customer is a first-time buyer.

How can I extract the above information?

I'd appreciate your help!

Edit: Eventually, I'm aiming to visualize the top 10 selling products for 1st-time buyers, 2nd-time buyers, and so forth (my original data is much bigger than the sample)


Solution

  • Try this:

    /*first cte*/
    WITH cte AS (
      SELECT user_id, Product, Total,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY Total DESC) Rnum,
           SUM(Total) OVER (PARTITION BY user_id) AS GTotal,
           order_id
    FROM myData),
    /*second cte*/
      cte2 AS (
    SELECT user_id,
           CASE WHEN Count(DISTINCT(order_id)) = 1 THEN "FirstTimeBuyer" 
                     WHEN Count(DISTINCT(order_id)) = 2 THEN "SecondTimeBuyer" 
                     WHEN Count(DISTINCT(order_id)) = 3 THEN "ThirdTimeBuyer"
                     END AS TypeofBuyer
     FROM cte
      GROUP BY user_id)
    /*join the ctes*/
    SELECT cte.user_id, cte.Product, cte.Total, cte.GTotal, cte2.TypeofBuyer
       FROM cte 
       JOIN cte2 ON cte.user_id=cte2.user_id
    WHERE Rnum=1
    ORDER BY GTotal DESC;
    

    Using ROW_NUMBER() to get top sales and assign with row numbering and SUM() OVER () to get total sales by user_id in the first cte. Then re-using your method of getting TypeofBuyer in the second cte. Lastly, a query to join both of the ctes and filter only rows with row number=1 to get your desired result.

    Demo fiddle