Search code examples
sqlsql-serverjoinmappingrevision-history

SQL Join on table with revision dates


I am about to begin working on a solution for my company in SQL that requires me to link product category mappings based on a SKU number. The problem is that the category mappings can change, and they will have a date associated with the change.

For example:

  • Product 1 is currently mapped to category A.
  • User realizes it was mapped in error and changes the Product 1 mapping to Category B for all transactions dated after 1/1/2018.
  • Another user decides to map Product 1 to Category C, but only for transactions after 2/1/2019. And so on...

Ultimately, I need to assign a product's sales to the category it was in at the time the sale occured.

I figured I could do a Join like the below:

SELECT 
   A.TransactionDate,
   A.Product,  
   B.Category, 
   SUM(A.Sales) AS TotalSales 
FROM Products AS A 
JOIN CategoryMappings AS B ON 
   A.Product=B.Product AND 
   A.TransactionDate>=B.RevisionDate 
GROUP BY A.TransactionDate, A.Product, B.Category

This assumes I am getting each month's sales and category mapping, and each month's transactions are posted to a month end date ('1/31/2018','4/30/2019', etc.). Unfortunately, this Join would only work for the newest transactions if there is only one mapping change, but what if there were three or more as in the example? What if I wanted to see sales in 2018 based on the 2018 mapping specifically since it is sandwiched between two other mappings?

I've used Stack Overflow before, but this is my first question, so forgive me if it is missing information or not properly formatted.

Thank you for any help you can give!


Solution

  • Not sure without being able to verify against sample data.

    But I assume a NOT EXISTS could limit to the nearest revision.

    SELECT 
       P.TransactionDate,
       P.Product,  
       CatMap1.Category, 
       SUM(P.Sales) AS TotalSales 
    FROM Products AS P
    JOIN CategoryMappings AS CatMap1 
      ON CatMap1.Product = P.Product 
     AND CatMap1.RevisionDate <= P.TransactionDate
    
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM CategoryMappings AS CatMap2
        WHERE CatMap2.Product = P.Product
          AND CatMap2.RevisionDate <= P.TransactionDate
          AND CatMap2.RevisionDate > CatMap1.RevisionDate
    )
    
    GROUP BY P.TransactionDate, P.Product, CatMap1.Category
    

    A CROSS APPLY might also work

    SELECT 
       P.TransactionDate,
       P.Product,  
       CatMap.Category, 
       SUM(P.Sales) AS TotalSales 
    FROM Products AS P
    CROSS APPLY
    (
        SELECT TOP 1 CM.Category
        FROM CategoryMappings AS CM
        WHERE CM.Product = P.Product 
          AND CM.RevisionDate <= P.TransactionDate
        ORDER BY CM.RevisionDate DESC
    ) CatMap
    GROUP BY P.TransactionDate, P.Product, CatMap.Category