I have a table, #TEST, that lists a referenced Person Value (2), a Year for that person (1, 2, 3, 4, 5) and a Dvalue for each year (100,100,150,100,100).
Year (int), DValue (Numeric(18,2)), Person (int)
I am attempting to get this to a final output of the person value, start year, end year, and value.
Person = 2, startyear = 1, endyear = 2, value = 100
Person = 2, startyear = 3, endyear = 3, value = 150
Person = 2, startyear = 4, endyear = 5, value = 100
SELECT 1 YEAR, 100 DVALUE, 2 PERSON
INTO #TEST
UNION ALL
SELECT 2 YEAR, 100 DVALUE, 2 PERSON
UNION ALL
SELECT 3 YEAR, 150 DVALUE, 2 PERSON
UNION ALL
SELECT 4 YEAR, 100 DVALUE, 2 PERSON
UNION ALL
SELECT 5 YEAR, 100 DVALUE, 2 PERSON
The above code will populate the data in the table. I have attempted a few ordering options already which are included below with current output.
SELECT *
,DDENSERANK = DENSE_RANK () OVER (PARTITION BY PERSON , DVALUE ORDER BY YEAR)
,DLAG = CASE WHEN LAG(DVALUE) OVER (PARTITION BY PERSON ORDER BY YEAR) IS NULL THEN 2 ELSE CASE WHEN A.DAMT = LAG(DVALUE) OVER (PARTITION BY PERSON ORDER BY YEAR) THEN 2 ELSE 1 END END
,DROWNUM = ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY YEAR)
INTO #TESTFINAL
FROM #TEST A
ORDER BY YEAR
SELECT DISTINCT *
,DDENSERANKA = DENSE_RANK () OVER (PARTITION BY PERSON ,DLAG ,DVALUE ORDER BY YEAR)
,ROW_NUMBER() OVER (PARTITION BY PERSON , DLAG ORDER BY A.SYEAR)
FROM #TESTFINAL A
My Dense_Rank continues numbering after a break in Year 3. With Lag I run into issues with determining if the end year. How can I number these differently or select from these so that I could see the start year and stop year of each DVALUE change by PERSON with an output like the following?
PERSON, DVALUE, STARTYEAR, ENDYEAR
2, 100, 1, 2
2, 150, 3, 3
2, 100, 4, 5
Current Output shows:
SELECT
PERSON
, DVALUE
, MIN(YEAR)
, MAX(YEAR)
FROM #TESTFINAL
GROUP BY PERSON, DVALUE
PERSON, DVALUE, STARTYEAR, ENDYEAR
2, 100, 1, 5
2, 150, 3, 3
Try something like this:
drop table #test
SELECT 1 YEAR, 100 DVALUE, 2 PERSON
INTO #TEST
UNION ALL
SELECT 2 YEAR, 100 DVALUE, 2 PERSON
UNION ALL
SELECT 3 YEAR, 150 DVALUE, 2 PERSON
UNION ALL
SELECT 4 YEAR, 100 DVALUE, 2 PERSON
UNION ALL
SELECT 5 YEAR, 100 DVALUE, 2 PERSON
select person, dvalue, MIN(year) AS startyear, max(year) AS endyear
from (
select count(case when prevvalue <> dvalue then 1 end) over(partition by person order by year) as cnt
, *
from (
select *
, LAG(DValue) OVER(PARTITION BY Person ORDER BY Year) AS prevValue
from #TEST
) x
) x
group by person, dvalue, cnt
order by person, MIN(year)
It's a very standard LAG / COUNT technique which finds "groups" of same values as previous ones. After you have that, it's easy to GROUP BY and get the edges