I use the query below to find the median for every sector:
SELECT DISTINCT Sector,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value)
           OVER (PARTITION BY Sector) AS Median
FROM TABLE
The table is in the format below:
Sector Date Value
A 2014-08-01 1
B 2014-08-01 5
C 2014-08-01 7
A 2014-08-02 6
B 2014-08-02 5
C 2014-08-02 4
A 2014-08-03 3
B 2014-08-03 9
C 2014-08-03 6
A 2014-08-04 5
B 2014-08-04 8
C 2014-08-04 9
A 2014-08-05 5
B 2014-08-05 7
C 2014-08-05 2
So I get the expected result, as below:
Sector Median
A 5
B 7
C 6
Now I need to change the process so that the medians are calculated while considering only the records up to the given date. So the new result would be:
Sector Date Median
A 2014-08-01 1
B 2014-08-01 5
C 2014-08-01 7 (Only 1 record each was considered for A, B and C)
A 2014-08-02 3.5
B 2014-08-02 5
C 2014-08-02 5.5 (2 records each were considered for A, B and C)
A 2014-08-03 3
B 2014-08-03 5
C 2014-08-03 6 (3 records each were considered for A, B and C)
A 2014-08-04 4
B 2014-08-04 6.5
C 2014-08-04 6.5 (4 records each were considered for A, B and C)
A 2014-08-05 5
B 2014-08-05 7
C 2014-08-05 6 (All 5 records each were considered for A, B and C)
So this will be a sort of cumulative median. Can someone please tell me how to achieve this? My table has about 2.3M records, with about 1100 records each for about 1100 dates.
Please let me know if you need any more info.
That makes it harder, because the following does not work (SQL Server does not allow ORDER BY in the OVER clause of PERCENTILE_DISC):
SELECT DISTINCT Sector, Date,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value) OVER (PARTITION BY sector ORDER BY DATE) AS Median
FROM TABLE;
Alas. You can use cross apply for this purpose:
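-- The date filter in the where clause limits each median to rows up to t.date;
-- the OVER clause may only contain PARTITION BY for PERCENTILE_DISC.
select t.sector, t.date, t.value, m.median
from table t cross apply
     (select top 1 PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY t2.Value) OVER (PARTITION BY t2.sector) AS Median
      from table t2
      where t2.sector = t.sector and t2.date <= t.date
     ) m;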
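Note that the expected output in the question contains interpolated values such as 3.5 and 6.5, which PERCENTILE_DISC cannot return because it always picks an actual value from the set. If those are what you want, PERCENTILE_CONT is probably the function to use. Here is a minimal sketch of the same cross apply approach against the sample data; the temp table name #SectorValues is just something I made up for illustration, and I am assuming SQL Server 2012 or later:

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #SectorValues (
    Sector  char(1) NOT NULL,
    [Date]  date    NOT NULL,
    [Value] int     NOT NULL
);

INSERT INTO #SectorValues (Sector, [Date], [Value]) VALUES
    ('A', '2014-08-01', 1), ('B', '2014-08-01', 5), ('C', '2014-08-01', 7),
    ('A', '2014-08-02', 6), ('B', '2014-08-02', 5), ('C', '2014-08-02', 4),
    ('A', '2014-08-03', 3), ('B', '2014-08-03', 9), ('C', '2014-08-03', 6),
    ('A', '2014-08-04', 5), ('B', '2014-08-04', 8), ('C', '2014-08-04', 9),
    ('A', '2014-08-05', 5), ('B', '2014-08-05', 7), ('C', '2014-08-05', 2);

-- For each row, compute the interpolated median over all rows of the
-- same sector dated on or before that row's date.
SELECT t.Sector, t.[Date], m.Median
FROM #SectorValues AS t
CROSS APPLY (
    -- Every row in the partition carries the same median, so TOP (1) is enough.
    SELECT TOP (1)
           PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY t2.[Value])
               OVER (PARTITION BY t2.Sector) AS Median
    FROM #SectorValues AS t2
    WHERE t2.Sector = t.Sector
      AND t2.[Date] <= t.[Date]
) AS m
ORDER BY t.[Date], t.Sector;

DROP TABLE #SectorValues;
```

The subquery is re-evaluated for every outer row, so on 2.3M records this could be slow. An index on (Sector, Date) that includes Value might help, but I have not measured it on data of that size.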