I have the following tables in a Postgres 9.5 database:
product_days
Column | Type | Modifiers
-----------+---------+----------------------------------------------
id | integer | not null default nextval('product_days_id_seq'::regclass)
days_id | integer |
product_id | integer |
available | integer |
price | integer |
Indexes:
"pk_product_days" PRIMARY KEY, btree (id)
"idx_product_days" btree (days_id)
"idx_product_days_0" btree (product_id)
Foreign-key constraints:
"product_days_days_id_fkey" FOREIGN KEY (days_id) REFERENCES days(id)
product
Column | Type | Modifiers
----------------+-----------------------------+-----------------------------------------------------
id | integer | not null default nextval('product_id_seq'::regclass)
name | character varying(100) |
number_of_items | integer |
created_at | timestamp without time zone | default now()
updated_at | timestamp without time zone | default now()
total_number | integer |
Indexes:
"pk_product" PRIMARY KEY, btree (id)
product_days.product_id
is a foreign key referring to the product
table and available
represents the number of products available in each day.
I want to get all products that are available (available > 0
) at some specific days (days_id between 5 and 10
). They should be available at all of these days.
Currently I am trying to get the result with the following query but I am not sure whether it is correct or this is the most efficient way to do it:
select product.id as p_id, product.name as p_name, product.number_of_items as items
from product_days join product ON product_days.product_id = product.id
WHERE product_days.available > 0
AND prodcut_days.days_id between 5 and 10
group by product.id
HAVING count(*) > 5;
The output should be like:
p_id | p_name | items
-------+-----------+-------
1 | product_1 | 4
2 | product_2 | 13
I need the most efficient way of running this query either in SQL or in plpgsql.
After clarifying the typos, yes, the query should do what you describe.
This is equivalent and a bit faster:
SELECT id AS p_id, name AS p_name, number_of_items AS items
FROM (
SELECT product_id AS id
FROM product_days
WHERE available > 0
AND days_id BETWEEN 5 AND 10
GROUP BY 1
HAVING count(*) > 5
) d
JOIN product p USING (id);
There should be a UNIQUE
constraint enforcing max 1 entry per product and day:
ALTER TABLE product_days ADD CONSTRAINT your_name_here UNIQUE (product_id, days_id);
Related:
If you need to optimize performance for this particular query and there are more than a few non-qualifying rows (not available
or day_id
not matching), the sharpest weapon would be a partial index:
CREATE INDEX idx_name_here ON product_days (product_id)
WHERE available > 0
AND days_id BETWEEN 5 AND 10;
If you have autovacuum
running and the write load on the table isn't too big, you should see index-only scans on this index.
And while this index (or a more generic one) is used, this query should be faster, yet:
SELECT id AS p_id, name AS p_name, number_of_items AS items
FROM (
SELECT product_id AS id
FROM product_days d5
JOIN product_days d6 USING (product_id)
JOIN product_days d7 USING (product_id)
JOIN product_days d8 USING (product_id)
JOIN product_days d9 USING (product_id)
JOIN product_days d10 USING (product_id)
WHERE d5.days_id = 5 AND d5.available > 0
AND d6.days_id = 6 AND d6.available > 0
AND d7.days_id = 7 AND d7.available > 0
AND d8.days_id = 8 AND d8.available > 0
AND d9.days_id = 9 AND d9.available > 0
AND d10.days_id = 10 AND d10.available > 0
) d
JOIN product p USING (id);
Since this is a case of relational-division at its core. See: