Search code examples
mysqlsqlleft-joingreatest-n-per-group

MySQL Join table row based on lowest cell value


I have two tables in a MySQL database like this:

PRODUCT:

product_id | product_name
-----------+-------------
1          |  shirt
2          |  pants
3          |  socks

PRODUCT_SUPPLIER: (id is primary key)

id  |  supplier_id  |  product_id  |  part_no  |  cost
----+---------------+--------------+-----------+--------
1   |  1            |  1           | s1p1      | 5.00
2   |  1            |  2           | s1p2      | 15.00
3   |  1            |  3           | s1p3      | 25.00
4   |  2            |  1           | s2p1      | 50.00
5   |  2            |  2           | s2p2      | 10.00
6   |  2            |  3           | s2p3      | 5.00

My goal is a query that joins the tables and outputs a single row for each product joined with all fields from the corresponding supplier row with the lowest cost like this:

product_id | product_name  | supplier_id   |  part_no   | cost
-----------+---------------+---------------+------------+---------
1          | shirt         | 1             |  s1p1      | 5.00
2          | pants         | 2             |  s2p2      | 10.00
3          | socks         | 2             |  s3p3      | 5.00

At present I do have the following query written which seems to work but I'd like to know from any of the more experienced SQL users if there is a cleaner, more efficient or otherwise better solution? Or if there is anything essentially wrong with the code I have?

SELECT p.product_id, p.product_name, s. supplier_id, s.part_no, s.cost
FROM product p
LEFT JOIN product_supplier s ON
   (s.id = (SELECT s2.id 
            FROM product_supplier s2
            WHERE s2.product_id = p.product_id
            ORDER BY s2.cost LIMIT 1));

Solution

  • I would run:

    select p.product_id, p.product_name, s.supplier_id, s.part_no, s.cost
      from product p
      join product_supplier s
        on p.product_id = s.product_id
      join (select product_id, min(cost) as min_cost
              from product_supplier
             group by product_id) v
        on s.product_id = v.product_id
       and s.cost = v.min_cost
    

    I don't see the point in an outer join. Is every product is on the product_supplier table? If not then the outer join makes sense (change the join to inline view aliased as v above to a left join if that is the case).

    The above may run a little faster than your query because the subquery is not running for each row. Your current subquery is dependent and relative to each row of product.

    If you want to eliminate ties and don't care about doing so arbitrarily you can add a random number to the end of the results, put the query into an inline view, and then select the lowest/highest/etc. random number for each group. Here is an example:

    select product_id, product_name, supplier_id, part_no, cost, min(rnd)
      from (select p.product_id,
                   p.product_name,
                   s.supplier_id,
                   s.part_no,
                   s.cost,
                   rand() as rnd
              from product p
              join product_supplier s
                on p.product_id = s.product_id
              join (select product_id, min(cost) as min_cost
                     from product_supplier
                    group by product_id) v
                on s.product_id = v.product_id
               and s.cost = v.min_cost) x
     group by product_id, product_name, supplier_id, part_no, cost
    

    If for some reason you don't want the random # to come back in output, you can put the whole query above into an inline view, and select all columns but the random # from it.