Search code examples
sql-servert-sqlquery-optimization

set difference in SQL query


I'm trying to select records with a statement

SELECT * 
FROM A 
WHERE 
  LEFT(B, 5) IN 
    (SELECT * FROM 
       (SELECT LEFT(A.B,5), COUNT(DISTINCT A.C) c_count 
        FROM A 
        GROUP BY LEFT(B,5)
       ) p1 
       WHERE p1.c_count = 1
     ) 
     AND C IN 
        (SELECT * FROM 
            (SELECT A.C , COUNT(DISTINCT LEFT(A.B,5)) b_count 
             FROM A 
             GROUP BY C
            ) p2 
          WHERE p2.b_count = 1)

which takes a long time to run ~15 sec.

Is there a better way of writing this SQL?


Solution

  • You don't need to return data from the nested subqueries. I'm not sure this will make a difference withiut indexing but it's easier to read.

    And EXISTS/JOIN is probably nicer IMHO then using IN

    SELECT * 
    FROM
        A 
        JOIN
        (SELECT LEFT(B,5) AS b1
            FROM A 
            GROUP BY LEFT(B,5)
            HAVING COUNT(DISTINCT C) = 1
        ) t1 On LEFT(A.B, 5) = t1.b1
        JOIN
        (SELECT C AS C1
            FROM A 
            GROUP BY C
            HAVING COUNT(DISTINCT LEFT(B,5)) = 1
        ) t2 ON A.C = t2.c1
    

    But you'll need a computed column as marc_s said at least

    And 2 indexes: one on (computed, C) and another on (C, computed)