I have a database that stores products "available on the market" and products "still in development"
in two separate tables (market_product
and dev_product
). A third table (substance) contains all
substances a product can made of. Other two tables (marked_product_comp
and dev_product_comp
)
mantains product compositions.
I want to select products still in development that are made of the same ingredients of marketed products.
In the following (simplified) example the query must select product with ID = 2 from dev_product table.
CREATE table market_product (ID SERIAL PRIMARY KEY);
CREATE table dev_product (ID SERIAL PRIMARY KEY);
CREATE table substance (ID SERIAL PRIMARY KEY);
CREATE table market_product_comp (prodID SERIAL, substID SERIAL, PRIMARY KEY(prodID,substID));
CREATE table dev_product_comp (devID SERIAL, substID SERIAL, PRIMARY KEY(devID,substID));
INSERT INTO market_product VALUES (1),(2);
INSERT INTO dev_product VALUES (1),(2);
INSERT INTO substance VALUES (1),(2),(3);
INSERT INTO market_product_comp VALUES (1,1),(1,2),(2,3);
INSERT INTO dev_product_comp VALUES (1,2),(2,1),(2,2);
How to write such query?
UPDATE:
Sorry, I haven't noticed I asked my question in an ambiguous way.
I want to select products still in development that have the same composition of at least one marketed product. For example, if there is a dev_product
made by substances {1,2} and only one market_product
made by substances {1,2,3}, I want to discard that dev_product
, because it has a different composition. I hope this clarify.
In MySQL
:
SELECT *
FROM dev_product dp
WHERE EXISTS
(
SELECT NULL
FROM market_product mp
WHERE NOT EXISTS
(
SELECT NULL
FROM dev_product_comp dpc
WHERE dpc.prodID = dp.id
AND NOT EXISTS
(
SELECT NULL
FROM market_product_comp mpc
WHERE mpc.prodID = mp.id
AND mpc.substID = dpc.substID
)
)
AND NOT EXISTS
(
SELECT NULL
FROM market_product_comp mpc
WHERE mpc.prodID = mp.id
AND NOT EXISTS
(
SELECT NULL
FROM dev_product_comp dpc
WHERE dpc.prodID = dp.id
AND dpc.substID = mpc.substID
)
)
)
In PostgreSQL
:
SELECT *
FROM dev_product dp
WHERE EXISTS
(
SELECT NULL
FROM market_product mp
WHERE NOT EXISTS
(
SELECT NULL
FROM (
SELECT substID
FROM market_product_comp mpc
WHERE mpc.prodID = mp.ID
) m
FULL OUTER JOIN
(
SELECT substID
FROM dev_product_comp dpc
WHERE dpc.devID = dp.ID
) d
ON d.substID = m.substID
WHERE d.substID IS NULL OR m.substID IS NULL
)
)
Neither from these queries uses COUNT(*)
: it's enough to find but a single non-matching component to stop evaluating the whole pair.
See these entries in my blog for explanations:
PostgreSQL
, with FULL OUTER JOIN
)MySQL
, with EXISTS
)