Search code examples
sqlsql-server

Clean up a SQL Server Query with too many DISTINCT clauses in it


In SQL Server, I have a join table Visit between the Customer and Store tables.

I am trying to find:

  • A list of CustomerID's
  • Who have visited exactly 1 store (StoreID)
  • Except I am only interested in Customers that have visited specific stores (1, 10, 16, 42)
  • Hence I want to reject a customer who has visited stores 1 and 2, since they have visited multiple stores
  • If a customer has visited the same store multiple times, I want to include them in the result. The Visit table has dates and other columns, hence there are multiple visits with the same customer going back to on multiple dates.

I think that the following query is correct - in that it appears to return the correct result:

SELECT DISTINCT 
    v.CustomerID
FROM
    Visit v
INNER JOIN 
    (SELECT CustomerID 
     FROM 
         (SELECT DISTINCT CustomerID, StoreID  
          FROM Visit) temp
     GROUP BY CustomerID
     HAVING COUNT(*) = 1) c ON v.CustomerID = c.CustomerID
WHERE 
    v.StoreID IN (1, 10, 16, 42)

However I want know if there is any way to simplify/clean it up as there are multiple DISTINCT and GROUP BY clauses in it.

Note: It's not ideal, but if necessary I can live with a query that returns customers who have visited any of desired Stores (1, 10, 16, 42) in any combination, any number of times as long as they haven't visited a store not in that list.


Solution

  • Something like this perhaps:

    SELECT v.CustomerID, MAX(v.StoreID) AS StoreID
    FROM Visit v
    GROUP BY CustomerID
    HAVING MAX(StoreId) = MIN(StoreId) -- Alternatively COUNT(DISTINCT v.StoreID) = 1
    AND MAX(v.StoreID) IN (1, 10, 16, 42)
    

    COUNT(DISTINCT StoreID) gives you customers with only one unique store

    Note: You can also use: MAX(StoreId) = MIN(StoreId) instead of COUNT(DISTINCT) because it has in some cases better performance characteristics.

    To get which store it is, I use MAX(x.StoreID). MAX or MIN doesn't matter here, as long as it's some kind of aggregation.