I'm currently using MS SQL Server 2016. I have two tables that are used to collect requests throughout the day to find parts that match either ALL or ANY of the requested part attributes:
CREATE TABLE requests (
r1_request_id int NOT NULL,
r1_match_type int NOT NULL,
CONSTRAINT PK_requests PRIMARY KEY CLUSTERED
(
r1_request_id ASC
)
);
CREATE TABLE request_attributes (
r2_request_id int NOT NULL,
r2_part_attribute int NOT NULL,
CONSTRAINT PK_request_attributes PRIMARY KEY CLUSTERED
(
r2_request_id ASC,
r2_part_attribute ASC
)
);
The r1_match_type column contains either a 1 or 2, where:
1 = ALL request_attributes must be present in a part
2 = ANY request_attributes may be present in a part
The third table is the parts table:
CREATE TABLE parts (
p1_part_id int NOT NULL,
p1_attribute int NOT NULL,
CONSTRAINT PK_parts PRIMARY KEY CLUSTERED
(
p1_part_id ASC,
p1_attribute ASC
)
);
At the end of the day a query is performed to find all requests and the parts that contain ANY or ALL attributes for that request. For example, given the following data:
Table: parts
p1_part_id p1_attribute
10 1
10 2
10 3
10 4
20 1
20 2
20 3
20 4
30 2
30 4
Table: requests
r1_request_id r1_match_type
1 1
2 2
Table: request_attributes
r2_request_id r2_part_attribute
1 1
1 2
1 3
1 4
2 2
2 4
Given the data above, request 1 has 4 part attributes; ALL of those attributes must be present in any parts. Request 2 has 2 part attributes; ANY of which may be present in a part.
The following query matches requests with parts, but does not make use of the flag (r1_match_type) used to indicate whether request part attributes must match ALL or ANY of the part attributes:
SELECT r1_request_id, p1_part_id, COUNT(*) AS attribute_count
FROM requests
JOIN request_attributes ON ( r2_request_id = r1_request_id )
JOIN parts ON ( p1_attribute = r2_part_attribute )
GROUP BY r1_request_id, p1_part_id
ORDER BY r1_request_id, p1_part_id
r1_request_id p1_part_id attribute_count
------------- ----------- ---------------
1 10 4
1 20 4
1 30 2
2 10 2
2 20 2
2 30 2
The correct output needs to take into account the r1_match_type flag and should produce the output shown below because request 1 has 4 part attributes and ALL 4 attributes must be present in a part (only parts 10 and 20 have all 4 attributes).
r1_request_id p1_part_id
------------- -----------
1 10
1 20
2 10
2 20
2 30
Typically, there are nearly 50,000 requests throughout the day and about 1.3 million parts in the DB.
Curious as to how others would tackle this particular problem? Using a single query, or two queries (one to find ALL, another to find ANY)?
I think conditional HAVING
should work for you. Check this, I think it can help you :
SELECT r2_request_id
, p1_part_id
--,MAX(r1_match_type) match_type
,COUNT(*) AS attribute_count
FROM requests r
JOIN request_attributes ra1 ON ( r.r1_request_id = ra1.r2_request_id )
JOIN parts p ON ( p.p1_attribute = ra1.r2_part_attribute )
GROUP BY r2_request_id
, p.p1_part_id
HAVING ( MAX(r.r1_match_type)=2
OR COUNT(1)=(SELECT COUNT(1)
FROM request_attributes ra2
WHERE ra1.r2_request_id=ra2.r2_request_id)
)
ORDER BY ra1.r2_request_id
, p.p1_part_id