I'm trying to find total books sold (sum of quantities) for each product_id
and txn_day
. I have 2 tables, transactions
and catalog
as below:
Table 1: transaction
market_id, txn_day, customer_id, product_id, quantity
1,2019-03-01,1,B0002,1
1,2019-03-01,2,B0003,1
1,2019-03-01,1,B0001,1
3,2019-03-01,3,B0001,1
3,2019-03-01,4,B0002,1
4,2019-03-01,1,B0002,1
4,2019-03-01,5,B0001,1
4,2019-03-01,6,B0001,1
Table 2: catalog
market_id, product_id, title_name
1,B0001, Harry Potter 1
1,B0002, Harry Potter 2
1,B0003, Harry Potter 3
3,B0001, Harry Potter 1
3,B0002, Harry Potter 2
3,B0003, Harry Potter 3
4,B0001, Harry Potter 1
4,B0002, Harry Potter 2
4,B0003, Harry Potter 3
I wrote the following query and got the total books sold (sum of quantities) for product_id
:
SELECT
transaction.txn_day, transaction.product_id,
SUM(quantity) AS quantity
FROM
transaction
GROUP BY
transaction.product_id, transaction.txn_day;
I tried following query to get title_name
for each product_id
, but it seems incorrect.
SELECT
transaction.txn_day, transaction.product_id, catalog.title_name,
SUM(quantity) AS quantity
FROM
catalog
INNER JOIN
transaction ON catalog.product_id = transaction.product_id
GROUP BY
transaction.txn_day, transaction.product_id, catalog.title_name;
I'm getting the following result:
|txn_day |product_id |title_name |quantity
|2019-03-01 |B0002 |Harry Potter 2 |9
|2019-03-01 |B0001 |Harry Potter 1 |12
|2019-03-01 |B0003 |Harry Potter 3 |3
I'm expecting the result to be something like:
|txn_day | product_id | quantity | title_name
|2019-03-01 | B0003 | 1 | Harry Potter 3
|2019-03-01 | B0002 | 3 | Harry Potter 2
|2019-03-01 | B0001 | 4 | Harry Potter 1
Please suggest any changes required to the query.
I loaded your data and got the following results:
DATE ID TITLE QTY
3/1/2019 B0002 Harry Potter 2 9
3/1/2019 B0003 Harry Potter 3 3
3/1/2019 B0001 Harry Potter 1 12
Your clue here as to what is wrong is that your Qty is 3x too much for each record.
If you look at your INNER JOIN you are only joining by the product_id.
Take a look at your transaction rows.
You can see that each product_id exists along with 3 different market_ids, that is why your qty is tripling.
Solution: Add the market_id to the JOIN:
SELECT t.txn_day
,t.product_id
,c.title_name
,SUM(quantity) AS quantity
FROM catalog c
INNER JOIN transactions t ON c.product_id = t.product_id AND c.market_id = t.market_id
GROUP BY t.txn_day, t.product_id, c.title_name
order by c.title_name;
You will get your results:
3/1/2019 B0001 Harry Potter 1 4
3/1/2019 B0002 Harry Potter 2 3
3/1/2019 B0003 Harry Potter 3 1