Search code examples
mysqlsqlmariadbgreatest-n-per-group

Using a subquery to in WHERE clause but limiting to one match per sub query result


I'm using MariaDB.

I have two tables, REPLEN stores products sold, quantity and date sold, etc. and another is the main PRODUCT table with description, stock code, qty in stock, date last delivered, etc. STOCK_CODE is important as identical matches are shared by alternatives to that product.

I want to use this stock code to find alternatives to any entries in REPLEN which are now out of stock but have alternatives in stock. I want to limit this to one alternative per match AND have that alternative be the OLDEST delivered for stock rotation purposes.

The tables look like this:

REPLEN:

PROD_ID     QTY_SOLD   DATE_LAST_SOLD
4552        6          2020-10-28
8612        2          2020-11-14
7661        1          2020-11-15
9891        5          2020-11-17

This is tied to the PRODUCT table via the PROD_ID KEY which would like something like this

PRODUCT_ID   DESCRIPTION             STOCK_CODE     STOCK_QTY   DATE_LDELIV
4552         Cashew Nuts Best Buy    NUTS CASHEW       0         2020-11-01
8612         Baked Beans SaveMore    BEANS BUDGET      0         2020-08-12
7661         Nestle Instant Coffee   COFFEE INSTANT    40        2020-10-20
9891         Heinz Baked Beans       BEANS HEINZ       12        2020-10-09
10988        Baked Beans Supersaver  BEANS BUDGET      10        2020-11-04
11092        Baked Beans BestBuy     BEANS BUDGET      14        2020-10-27
12093        Cashew Nuts Supersaver  NUTS CASHEW       24        2020-11-18  

Now Products 4552 and 8612 are out of stock so there isn't any point telling someone to replenish them. However, I don't really care what brand budget beans or cashews are on the shelf so two other products also share the BEANS BUDGET stock_code and 1 other product shares NUTS CASHEW all of which ARE in stock.

I want a SELECT QUERY to find them!

So first I have a query that finds all STOCK_CODE for any product in REPLEN which aren't in stock

SELECT p.STOCK_CODE FROM product p JOIN replen r ON p.PRODUCT_ID=r.PROD_ID WHERE p.STOCK_QTY <=0

This returns "BEANS_BUDGET" and "NUTS CASHEW"

I can then use this as a subquery to find the products that are in stock and share the Stock_code

SELECT p.PRODUCT_ID, p.STOCK_CODE, p.DATE_LDELIV FROM product p 
WHERE p.STOCK_CODE IN 
  (SELECT p.STOCK_CODE FROM product p 
  JOIN replen r ON p.PRODUCT_ID=r.PROD_ID 
  WHERE p.STOCK_QTY <=0) 
AND p.STOCK_QTY > 0

But this returns ALL alternative products:

PRODUCT_ID     STOCK_CODE     DATE_LDELIV
10988          BEANS BUDGET   2020-11-04
11092          BEANS BUDGET   2020-10-27
12093          NUTS CASHEW    2020-11-18 

I only have space on the shelf for one row of Budget Beans!

This is where I'm stuck. I can add a GROUP BY:

SELECT p.PRODUCT_ID, p.STOCK_CODE, p.DATE_LDELIV FROM product p 
WHERE p.STOCK_CODE IN 
  (SELECT p.STOCK_CODE FROM product p 
  JOIN replen r ON p.PRODUCT_ID=r.PROD_ID 
  WHERE p.STOCK_QTY <=0) 
AND p.STOCK_QTY > 0 GROUP BY STOCK_CODE 

But this returns:

PRODUCT_ID     STOCK_CODE     DATE_LDELIV
10988          BEANS BUDGET   2020-11-04
12093          NUTS CASHEW    2020-11-18 

The trouble is product 11092 is older stock so I want that to be returned instead of 10988.

I can't use ORDER BY BEFORE my GROUP BY. Even if I could (by wrapping the query in another query using aliases) it always returns 10988 as MariaDB ignores any order in subquery results by design.

What is need is a query that will return only one alternative which is the OLDEST Delivered product. In essence, I want it to return:

PRODUCT_ID     STOCK_CODE     DATE_LDELIV
11092          BEANS BUDGET   2020-10-27
12093          NUTS CASHEW    2020-11-18 

Help!


Solution

  • Let's start with this query that almost works:

    SELECT p.PRODUCT_ID, p.STOCK_CODE, p.DATE_LDELIV
    FROM product p JOIN
    WHERE p.STOCK_CODE IN (SELECT p.STOCK_CODE
                           FROM product p JOIN
                               replen r
                               ON p.PRODUCT_ID = r.PROD_ID 
                           WHERE p.STOCK_QTY <= 0
                          ) AND
         p.STOCK_QTY > 0
    

    Next, I'll join in the replen table to get the oldest date:

    SELECT p.PRODUCT_ID, p.STOCK_CODE, p.DATE_LDELIV, r.date_last_sold,
           ROW_NUMBER() OVER (PARTITION BY p.PRODUCT_ID ORDER BY r.date_last_sold) as seqnum
    FROM product p JOIN
         replen r
         ON p.PRODUCT_ID = r.PROD_ID 
    WHERE p.STOCK_CODE IN (SELECT p.STOCK_CODE
                           FROM product p JOIN
                                replen r
                                ON p.PRODUCT_ID = r.PROD_ID 
                           WHERE p.STOCK_QTY <= 0
                          ) AND
         p.STOCK_QTY > 0;
    

    Then use this as a subquery to get the one row per product that you want:

    SELECT p.*
    FROM (SELECT p.PRODUCT_ID, p.STOCK_CODE, p.DATE_LDELIV, r.date_last_sold,
                 ROW_NUMBER() OVER (PARTITION BY p.PRODUCT_ID ORDER BY r.date_last_sold) as seqnum
          FROM product p JOIN
               replen r
               ON p.PRODUCT_ID = r.PROD_ID 
          WHERE p.STOCK_CODE IN (SELECT p.STOCK_CODE
                                 FROM product p JOIN
                                      replen r
                                      ON p.PRODUCT_ID = r.PROD_ID 
                                 WHERE p.STOCK_QTY <= 0
                                ) AND
               p.STOCK_QTY > 0
         ) p
    WHERE seqnum = 1;