I have some data on individuals in our database. I need to select each observation only once, but that selection needs to be of the earliest occurrence.
There are 2 identifiers for occurrence though, year, and quarter:
CustID | OrderYear | Quarter
0001 | 2012 | 3
0001 | 2012 | 4
0001 | 2013 | 1
0002 | 2012 | 3
0002 | 2012 | 4
0003 | 2013 | 1
I need the select statement to grab it so that the output is:
CustID | OrderYear | Quarter
0001 | 2012 | 3
0002 | 2012 | 3
0003 | 2013 | 1
So, observations are grouped by customer ID, then by Order Year, then by Quarter. Does this make sense? Without quarter I can get it just fine, but I need quarter in order to go the next step of the analysis and when I do that my technique fails.
Without quarter, I could just use First(orderYear)
in the select statement, group by CustID
, and it works just fine (the dataset is sorted by CustID and then Year, and then Quarter in ascending order, so in this case the First() grabs the earliest occurrence
SELECT CustID, First(OrderYear)
FROM Table1
GROUP BY CustID
The added layer of quarter is throwing me off though.
Thanks for your help!
Use a GROUP BY
to determine the minimum OrderYear for each CustID. Then join that query back to the main table to filter the dataset to only those combinations. Finally you can find the minimum Quarter for each OrderYear/CustID combination.
SELECT t.CustID, t.OrderYear, Min(t.Quarter)
FROM
Table1 AS t
INNER JOIN
(
SELECT CustID, Min(OrderYear) AS MinOfOrderYear
FROM Table1
GROUP BY CustID
) AS sub
ON
t.CustID = sub.CustID
AND t.OrderYear = sub.MinOfOrderYear
GROUP BY t.CustID, t.OrderYear;