I'm using SQL Server 2008 R2. I have table called Name with the following structure and sample data:
ID Date Name DOD
10001 200911 Kevin H 06/17/2000
10001 200912 Kevin 06/20/2000
10001 201001 Kevin 06/20/2000
10001 201012 K 06/20/2000
10001 201101 K 06/20/2000
10001 201406 Kevin 06/20/2000
Notice that the Employee 10001 has been changing names 3 times and DODs once over time. What I am trying to do is to group by ID, Name, and DOD such that the data is consistent between dates. I also need to grab the min and max date for these groups and ensure that the dates are in sequential order. If the name or DOD changes and then changes back what it was previously, a new group would need to be created.
So, the output will look like this:
EmployeeID MinDate MaxDate Name DOD
10001 200911 200911 Kevin H 06/17/2000
10001 200912 201001 Kevin 06/20/2000
10001 201012 201101 K 06/20/2000
10001 201406 201406 Kevin 06/20/2000
The Name table is quite large so there will be instances where Data is consistent for 20 months, then inconsistent for 1 month, and then back to being consistent for 20 months.
Thank you in advance and please let me know if you need additional information.
You can use the difference of row numbers approach:
select employeeid, min(date) as mindate, max(date) as maxdate,
name, dod
from (select t.*,
row_number() over (partition by employeeid order by date) as seqnum_d,
row_number() over (partition by employeeid, name, dod order by date) as seqnum_dnd
from t
) t
group by employeeid, (seqnum_d - seqnum_dnd);
To see how this works, run the subquery. If you stare at the results for a while, you'll get why the difference of row numbers works in this case.