Search code examples
mysqlsqlmysql-8.0

Determine customers that have spent money at the company for the first time, each month (mySQL)


I have a table below and I am trying to determine the number of users that have spent money at the company for the first time each month.

What I would like is to have a resulting table with new users, month and year as columns.

Before people downvote this post, I have looked through various posts already and can't seem to find a similar way to solve this issue. The code that I have included below is based on what I've managed to piece together from relevant posts.

This is the original table:

+---------------------+-------------+-----------------+
| datetime            | customer_id | amount          |
+---------------------+-------------+-----------------+
| 2018-03-01 03:00:00 | 3786        | 14              |
+---------------------+-------------+-----------------+
| 2018-03-02 17:00:00 | 5678        | 25              |
+---------------------+-------------+-----------------+
| 2018-08-17 19:00:00 | 5267        | 45              |
+---------------------+-------------+-----------------+
| 2018-08-25 08:00:00 | 3456        | 78              |
+---------------------+-------------+-----------------+
| 2018-08-25 17:00:00 | 3456        | 25              |
+---------------------+-------------+-----------------+
| 2019-05-25 14:00:00 | 3456        | 15              |
+---------------------+-------------+-----------------+
| 2019-07-02 14:00:00 | 88889       | 45              |
+---------------------+-------------+-----------------+
| 2019-08-25 08:00:00 | 1234        | 88              |
+---------------------+-------------+-----------------+
| 2019-08-30 09:31:00 | 1234        | 30              |
+---------------------+-------------+-----------------+
| 2019-08-30 12:00:00 | 9876        | 55              |
+---------------------+-------------+-----------------+
| 2019-09-01 13:00:00 | 88889       | 23              |
+---------------------+-------------+-----------------+

This is the CREATE statement:

CREATE TABLE IF NOT EXISTS `spend` ( `datetime` datetime NOT NULL, `customer_id` int(11) NOT NULL, `amount` int(11) NOT NULL, PRIMARY KEY (`datetime`)) DEFAULT CHARSET=utf8mb4;
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-01 03:00:00', 3786, 14);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-03-02 17:00:00', 5678, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-17 19:00:00', 5267, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 08:00:00', 3456, 78);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2018-08-25 17:00:00', 3456, 25);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-05-25 14:00:00', 3456, 15);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-07-02 14:00:00', 88889, 45);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-25 08:00:00', 1234, 88);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 09:31:00', 1234, 30);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-08-30 12:00:00', 9876, 55);
INSERT INTO `spend` (`datetime`, `customer_id`, `amount`) VALUES ('2019-09-01 13:00:00', 88889, 23);

This is the code that I have come up with:

SELECT S.datetime, S.customer_id, S.amount 
FROM spend S
INNER JOIN
    (SELECT customer_id, MIN(datetime) AS first_occurence
    FROM spend
    GROUP BY customer_id) X
ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence

This is the resulting table:

+------------------+-------------+-------+
| datetime         | customer_id |amount |
+------------------+-------------+-------+
| 01/03/2018 03:00 | 3786        | 14    |
+------------------+-------------+-------+
| 02/03/2018 17:00 | 5678        | 25    |
+------------------+-------------+-------+
| 17/08/2018 19:00 | 5267        | 45    |
+------------------+-------------+-------+
| 25/08/2018 08:00 | 3456        | 78    |
+------------------+-------------+-------+
| 02/07/2019 14:00 | 88889       | 45    |
+------------------+-------------+-------+
| 25/08/2019 08:00 | 1234        | 88    |
+------------------+-------------+-------+
| 30/08/2019 12:00 | 9876        | 55    |
+------------------+-------------+-------+

This is what an example of what the table should look like:

+-----------+-------+------+
| new_users | month | year |
+-----------+-------+------+
| 2         | 3     | 2018 |
+-----------+-------+------+
| 3         | 8     | 2018 |
+-----------+-------+------+
| 1         | 5     | 2019 |
+-----------+-------+------+
| 1         | 7     | 2019 |
+-----------+-------+------+
| 3         | 8     | 2019 |
+-----------+-------+------+
| 1         | 9     | 2019 |
+-----------+-------+------+

Solution

  • You started correctly. Now use that as a subquery to get the count by month.

    SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
    FROM (
        SELECT S.datetime, S.customer_id, S.amount 
        FROM spend S
        INNER JOIN
            (SELECT customer_id, MIN(datetime) AS first_occurence
            FROM spend
            GROUP BY customer_id) X
        ON S.customer_id = X.customer_id AND S.datetime = X.first_occurence
    ) AS x
    GROUP BY month, year
    ORDER BY year, month
    

    Actually, you don't even need the join in the subquery, since you're not using the amount from the first purchase in the final result.

    SELECT COUNT(*) AS new_users, MONTH(datetime) AS month, YEAR(datetime) AS year
    FROM (
        SELECT customer_id, MIN(datetime) AS datetime
        FROM spend
        GROUP BY customer_id
    ) AS x
    GROUP BY month, year
    ORDER BY year, month