Search code examples
mysqlsqlgreatest-n-per-groupcorrelated-subquery

Find value associated with min value of another col with a where clause


Say I have a table of prices;

+------------+----------+---+-----+----+-------+-----+
| product_id | price_id | a |  b  | c  | price | fee |
+------------+----------+---+-----+----+-------+-----+
|          1 |        1 | 1 | 100 | 10 |   500 |  60 |
|          1 |        2 | 1 | 100 | 20 |   505 |  50 |
|          1 |        3 | 1 | 200 | 10 |   510 |  30 |
|          1 |        4 | 1 | 200 | 20 |   515 |  25 |
|          1 |        5 | 1 | 300 | 10 |   520 |  15 |
|          1 |        6 | 1 | 300 | 20 |   525 |  50 |
|          1 |        7 | 2 | 100 | 10 |   530 |  40 |
|          1 |        8 | 2 | 100 | 20 |   535 |  35 |
|          1 |        9 | 2 | 200 | 10 |   540 |  60 |
+------------+----------+---+-----+----+-------+-----+

In reality this table would have hundreds of products and each of columns a, b, and c could take up around 10 values and there would be a price for every combination of these columns for each product.

I only want to display 1 price per product, so I have a GROUP BY on product_id.

Say I initially want to display the lowest price for each product, I can achieve this by SELECT min(price), no problem. Now, when I want to show the fee relating to the min price, I can't just show min(fee) because the prices and fees don't correlate and the minimum price does not necessarily have the lowest fee. So I join on a subquery, like so;

SELECT
    t.product_id,
    t.price_id,
    t.a,
    t.b, 
    t.c, 
    min(t.price) as `min_price`,
    t.fee,
    t2.fee AS `min_price_fee`
FROM
    prices as t
JOIN so_q as t2 on t.product_id = t2.product_id
    AND t.a = t2.a
    AND t.b = t2.b
    AND t.c = t2.c
    AND t2.price = (
        SELECT min(price)
        FROM so_q as t3 
        WHERE t3.product_id = t.product_id
--          AND t3.b = 300
    )
-- WHERE
--  t.b = 300
GROUP BY
    t.product_id;

But as you might have guessed from the lines I've commented out, my problem comes when users have added filters and there's now a where clause in play. I can't make this work without also putting the where clause in to the subquery, (if I don't I get no rows returned, which I think I understand) and my question is, is there a way I can do this so that I only have to have the where clause once?

Thanks for your advice - let me know if I should include any other info. Trying to distil an MCVE from the actual code I'm working with was complicated so I may have forgotten something obvious.

EDIT like MySQL version which is 5.5.56

EDIT 2

using @Gordon Linoff's suggestion;

SELECT
    p.* 
FROM
    prices p 
WHERE
    p.price = ( 
        SELECT min( p2.price )
        FROM prices p2
        WHERE p2.product_id = p.product_id
    )
AND b = 300;

I'm still getting 0 rows returned when I add the b = 300 condition to the where clause on the last line.

EDIT 3

To try and clarify what I'm trying to do: before any filters are added, for product 1, I want to display the minimum price (500) and the fee (60) from that record (price_id = 1). If a user adds a filter stipulating c = 20, then I want to display the minimum price which has a c value of 20 (505) and the fee (50) from that record (price_id = 2). I don't think I can use min(price) and min(fee) because I will end up with prices and fees from different records and they must be from the same record. So, I need to find the minimum price which satisfies all the user entered criteria (which end up as parts of the main where clause) and then find the fee associated with that price.


Solution

  • Taking @GordonLinoff's answer and extending the requirement to include minimising the the amount of code repetition, to make dynamic generation of the SQL simpler...

    Changing the correlated sub-query to return a row identifier instead of a minimum price has two consequences

    1. You only need to put the filter in the sub-query
    2. It will never return multiple rows in the event of a tie


    p.id = (SELECT id
              FROM prices p2
             WHERE p2.product_id = p.product_id
               AND <filters>
          ORDER BY price DESC,
                   some_tie_break_field(s)
             LIMIT 1
           )
    

    With such a structure, you would probably benefit from starting with a product table to minimise the work done by the correlated sub-query.

    SELECT
      prices.*
    FROM
      product
    INNER JOIN
      prices
          ON prices.id = (SELECT id
                            FROM prices p
                           WHERE p.product_id = product.id
                             AND <filters>
                        ORDER BY price DESC,
                                 some_tie_break_field(s)
                           LIMIT 1
                          )