sql sql-server self-join correlated-subquery

SQL Server - Correlated Subqueries / Self Join

A select statement returning the name, city, and state of each vendor that's located in a unique city and state (i.e., excludes vendors that have same city and state with another vendor)

SELECT 
    VendorName, VendorCity, VendorState
FROM 
    Vendors
WHERE 
    VendorState + VendorCity NOT IN (SELECT VendorState + VendorCity
                                     FROM Vendors
                                     GROUP BY VendorState + VendorCity
                                     HAVING COUNT(*) > 1)
ORDER BY 
    VendorState, VendorCity;

Alternate answer

SELECT 
    VendorName, VendorCity, VendorState
FROM 
    Vendors AS Vendors_Main
WHERE 
    VendorCity + VendorState NOT IN (SELECT VendorCity + VendorState
                                     FROM Vendors AS Vendors_Sub
                                     WHERE Vendors_Sub.VendorID <> Vendors_Main.VendorID)
ORDER BY 
    VendorState, VendorCity;

I understand the first answer, but not the alternate query. Point of confusion: wouldn't the line below return 0 rows since they're referencing the same table without an additional where clause?

WHERE Vendors_Sub.VendorID <> Vendors_Main.VendorID)

Solution

WHERE Vendors_Sub.VendorID <> Vendors_Main.VendorID) doesn't compare the same row of the same table.

A correlated subquery is logically executed once for each row in the outer query, in your case it checks for rows with the same VendorCity/ VendorState combination, but different VendorIDs.

In fact I would prefer a direct translation into a correlated NOT EXISTS:

SELECT VendorName, VendorCity, VendorState
FROM Vendors AS Vendors_Main
WHERE NOT EXISTS
 (
   SELECT *
   FROM Vendors AS Vendors_Sub
   WHERE Vendors_Sub.VendorCity  = Vendors_Main.VendorCity  -- same city
     AND Vendors_Sub.VendorState = Vendors_Main.VendorState -- same state
     AND Vendors_Sub.VendorID   <> Vendors_Main.VendorID    -- different vendor
 )
ORDER BY VendorState, VendorCity;

This prevents false positives like 'state' + 'acity' vs. 'statea' + 'city' which both concatenate to 'stateacity' and works for any kind of datatype.