Search code examples
sqlsql-serversubquery

SQL: Nested subquery is returning entries incorrectly


I'm having some trouble querying a dataset with a nested subquery, which I thought would be pretty straightforward.

I have a table of customers and their addresses dbo.PERSON_ADDRESSES and transactions with customers dbo.TRANSACT_CUSTOMERS. It is very common for customers to have multiple addresses stored in the dbo.PERSON_ADDRESSES table over time. I simply would like to use the most recent transaction in the dbo.TRANSACT_CUSOMTERS table to a table of most recent addresses from the dbo.PERSON_ADDRESSES table.

When I run the inner subquery independently, it works fine: it shows the one most recent transaction per customer like I envisioned. But, for some reason when I run this entire query, I obtain many, many addresses per customer. I don't understand why.

SELECT MaxTransaction.PERSON_ID, Addr.*
FROM dbo.PERSON_ADDRESSES AS Addr
INNER JOIN
   (SELECT PERSON_ID, Max(TRANSACTION_ID) AS MaxTID
    FROM dbo.TRANSACTION_CUSTOMERS 
    GROUP BY PERSON_ID) AS MaxTransaction
ON MaxTransaction.MaxTID = Addr.TRANSACTION_ID

Solution

  • I am guessing that one transaction can have multiple customers. To get one row per person, use an additional JOIN condition:

    SELECT maxp.PERSON_ID, pa.*
    FROM dbo.PERSON_ADDRESSES pa JOIN
         (SELECT PERSON_ID, Max(TRANSACTION_ID) AS MaxTID
          FROM dbo.TRANSACTION_CUSTOMERS 
          GROUP BY PERSON_ID
         ) maxp
         ON maxp.person_id = pa.person_id AND
            maxp.MaxTID = pa.TRANSACTION_ID;