Search code examples
sqlpostgresqlintersectionpostgresql-performance

Intersection of SQL statements efficiently


I'm trying to solve a SQL problem where I need to select the data in a certain intersection. This is more or less my query:

SELECT id FROM some_table WHERE
id IN (SELECT id FROM other_table WHERE _id=1 AND some_value >= 1 AND some_value <= 10) AND
id IN (SELECT id FROM another_table WHERE _id=2 AND some_other_value >= -65 AND some_other_value <= -2)

The problem with that query is that it does not return the intersection because the _id field is different both subqueries. So that query will return an empty result always.

If I use OR instead of AND to "intersect" the subqueries, then the intersection is also not returned.

I also do not want to use INTERSECT because that's kind of slow and this is just a very reduced example of a very large query, so speed is really important.

So, my question is, is there any way to do this intersection as fast as possible? Have in mind that even though in the example only appears two subqueries, in my real use case the number of subqueries can be larger.


Solution

  • I'm still unclear on what you're after w/o sample data or expected results, but would either of the following options work?

    using INNER JOIN

    SELECT id 
      FROM some_table A
     INNER JOIN (
           SELECT id
             FROM other_table
            WHERE _id = 1
              AND some_value >= 1
              AND some_value <= 10
     ) B on A.ID = B.ID
     INNER JOIN (
           SELECT id
             FROM another_table
            WHERE _id = 2
              AND some_other_value >= -65
              AND some_other_value <= -2
     ) C ON C.ID = A.ID
    

    using INTERSECT

    SELECT id
      FROM some_table
    INTERSECT
    SELECT id
      FROM other_table
     WHERE _id = 1
       AND some_value >= 1
       AND some_value <= 10
    INTERSECT 
    SELECT id
      FROM another_table
     WHERE _id = 2
       AND some_other_value >= -65
       AND some_other_value <= -2
    
    
    Then, I suppose we could use `EXISTS` and correlated subquery—instead of `IN`. This can be faster, as `EXISTS` can escape early whereas `IN` cannot.