Search code examples
sqlsql-server-2008query-performance

SQL query to match ALL and match ANY


I'm currently using MS SQL Server 2016. I have two tables that are used to collect requests throughout the day to find parts that match either ALL or ANY of the requested part attributes:

CREATE TABLE requests (
    r1_request_id int NOT NULL,
    r1_match_type int NOT NULL,
    CONSTRAINT PK_requests PRIMARY KEY CLUSTERED 
    (
        r1_request_id ASC
    )
);

CREATE TABLE request_attributes (
    r2_request_id int NOT NULL,
    r2_part_attribute int NOT NULL,
    CONSTRAINT PK_request_attributes PRIMARY KEY CLUSTERED 
    (
        r2_request_id ASC,
        r2_part_attribute ASC
    )
);

The r1_match_type column contains either a 1 or 2, where:

1 = ALL request_attributes must be present in a part

2 = ANY request_attributes may be present in a part

The third table is the parts table:

CREATE TABLE parts (
    p1_part_id int NOT NULL,
    p1_attribute int NOT NULL,
    CONSTRAINT PK_parts PRIMARY KEY CLUSTERED 
    (
        p1_part_id ASC,
        p1_attribute ASC
    )
);

At the end of the day a query is performed to find all requests and the parts that contain ANY or ALL attributes for that request. For example, given the following data:

Table: parts

p1_part_id  p1_attribute
10          1
10          2
10          3
10          4
20          1
20          2
20          3
20          4
30          2
30          4


Table: requests

r1_request_id   r1_match_type
1               1
2               2


Table: request_attributes

r2_request_id   r2_part_attribute
1               1
1               2
1               3
1               4
2               2
2               4

Given the data above, request 1 has 4 part attributes; ALL of those attributes must be present in any parts. Request 2 has 2 part attributes; ANY of which may be present in a part.

The following query matches requests with parts, but does not make use of the flag (r1_match_type) used to indicate whether request part attributes must match ALL or ANY of the part attributes:

SELECT r1_request_id, p1_part_id, COUNT(*) AS attribute_count
  FROM requests
  JOIN request_attributes ON ( r2_request_id = r1_request_id )
  JOIN parts ON ( p1_attribute = r2_part_attribute )
GROUP BY r1_request_id, p1_part_id
ORDER BY r1_request_id, p1_part_id

r1_request_id p1_part_id  attribute_count
------------- ----------- ---------------
1             10          4
1             20          4
1             30          2
2             10          2
2             20          2
2             30          2

The correct output needs to take into account the r1_match_type flag and should produce the output shown below because request 1 has 4 part attributes and ALL 4 attributes must be present in a part (only parts 10 and 20 have all 4 attributes).

r1_request_id p1_part_id
------------- -----------
1             10
1             20
2             10
2             20
2             30

Typically, there are nearly 50,000 requests throughout the day and about 1.3 million parts in the DB.

Curious as to how others would tackle this particular problem? Using a single query, or two queries (one to find ALL, another to find ANY)?


Solution

  • I think conditional HAVING should work for you. Check this, I think it can help you :

    SELECT    r2_request_id
            , p1_part_id
            --,MAX(r1_match_type) match_type
            ,COUNT(*) AS attribute_count
      FROM requests r
      JOIN request_attributes ra1 ON ( r.r1_request_id = ra1.r2_request_id )
      JOIN parts p ON ( p.p1_attribute = ra1.r2_part_attribute )
    GROUP BY  r2_request_id
            , p.p1_part_id
    HAVING (    MAX(r.r1_match_type)=2 
            OR  COUNT(1)=(SELECT COUNT(1) 
                          FROM request_attributes ra2 
                          WHERE ra1.r2_request_id=ra2.r2_request_id)
            )
    ORDER BY  ra1.r2_request_id
            , p.p1_part_id