Could really use a hand from someone on this mdx query. I am trying to produce a rolling median for the last 365 days on per user and per day basis. I need the median to be the median per user response days. It seems like a simple computation but I cannot see how to get it to work with the crossjoin in the mix. Any help would be so very appreciated! If you even have a suggestion on a direction to attack this from that would be great.
SET [2Years] AS
'{[FirstOrderDate].[Full Date].&[2010-01-15T00:00:00]:[FirstOrderDate].[Full Date].[2012-08-20T00:00:00]}'
MEMBER [Measures].[2YearMedianLag]
AS
median({[FirstOrderDate].[Full Date].currentmember.lag(365):[FirstOrderDate].[FullDate].currentmember} , [Measures].[Response Days])
SELECT {[Measures].[Response Days], [Measures].[MedianLag]} ON 0,
NonEmpty(crossjoin( [days],
[User].[User ID].children),[Measures].[Response Days]) ON 1
FROM [UserRevenue]
Thank you in advance for your assistance.
EDIT:
SampleData (UserName varchar(100) null, FirstOrderDate Datetime null, ResponseDays int null)
('Jim', '2001-01-03', 10)
('Fred', '2001-01-03', 80)
('Frank', '2001-01-04', 30)
('Sally', '2001-01-05', 18)
('Joan', '2001-01-06', 26)
('Bill', '2001-01-06', 15)
('Ted', '2001-01-08', 29)
('Sam', '2001-01-10', 9)
('Jane', '2001-01-17', 200)
SampleOutput (FirstOrderDate datetime null, MedianResponseDays int null)
('2001-01-03', 45)
('2001-01-04', 30)
('2001-01-05', 24)
('2001-01-06', 22)
('2001-01-07', 22)
('2001-01-08', 26)
('2001-01-09', 26)
('2001-01-10', 22)
('2001-01-11', 22)
('2001-01-12', 22)
('2001-01-13', 22)
('2001-01-14', 22)
('2001-01-15', 22)
('2001-01-16', 22)
('2001-01-17', 26)
It's tricky because you need to work with a different set of rolling dates per day on rows. Are you sure you want 365 for the lag? That gives you 1 year plus 1 day. Anyway, this technique uses an inline named set to create a named set for each combination of user/date and assigns it a unique number, then you can pull that named set back out again in a StrToSet function to match up with the current row's user and dates. This version factors in each individual user:
with
set Users as [User].[User ID].Children
set UsersDates as NonEmpty((Users, [FirstOrderDate].[Full Date].children), [Measures].[Response Days])
set [Rolling Period] as
Generate(
UsersDates,
StrToSet(
"{[FirstOrderDate].[Full Date].currentmember.lag(364): [FirstOrderDate].[Full Date].currentmember} as RP" + CStr(UsersDates.CurrentOrdinal)
)
)
member [Measures].[Median Lag] as
median(
StrToSet("RP" +
CStr(Rank(([User].[User ID].CurrentMember, [FirstOrderDate].[Full Date].CurrentMember), UsersDates)))
, [Measures].[Response Days])
select
{
[measures].[Response Days]
, [measures].[Median Lag]
}
on columns,
UsersDates
on rows
from UserRevenue
UPDATE #1: This version ignores the individual user and instead uses the response for all users for the applicable set of dates:
with
set Users as [User].[User ID].Children
set Dates as NonEmpty([FirstOrderDate].[Full Date].children, [Measures].[Response Days])
set [Rolling Period] as
Generate(
Dates,
StrToSet(
"{[FirstOrderDate].[Full Date].currentmember.lag(364): [FirstOrderDate].[Full Date].currentmember} as RP"
+ CStr(Dates.CurrentOrdinal)
)
)
member [Measures].[Median Lag] as
median(
StrToSet("RP" +
CStr(Rank([FirstOrderDate].[Full Date].CurrentMember, Dates)))
, ([Measures].[Response Days], [User].[User ID].[All]))
select
{
[measures].[Response Days]
, [measures].[Median Lag]
}
on columns,
(Users, Dates)
on rows
from UserRevenue
UPDATE #2: Third time's a charm? Here's a query that gets me the results in your sample output. The key is that the set needs to generate a tuple for each date/user combination for the current date and store that as an inline named set, one per possible date which is uniquely identified by rank. So the first date (1/3) is rank 1, second date (1/4) is rank 2 etc when you look at the list of dates on rows. The first date 1/3/2001 has two items in the set - one with Jim for 1/3 and one with Fred for 1/3. So in the median calculation, the response days for each item in the related set need to be used. Because 1/3 is rank 1 in the list dates, the set called RP1 is retrieved, combined with response days for the items in the set (Jim and Fred) and the median is calculated. Then the next date, 1/4, contains three items - the same as for 1/3 but now also Frank for 1/4, so that requires a recalculation of the median and so on.
with
set Users as [User].[User ID].Children
set Dates as [FirstOrderDate].[Full Date].children
set [Rolling Period] as
Generate(
Dates,
StrToSet(
"NonEmpty(({[FirstOrderDate].[Full Date].currentmember.lag(364): [FirstOrderDate].[Full Date].currentmember}
, Users), [Measures].[Response Days]) as RP"
+ CStr(Dates.CurrentOrdinal)
)
)
member [Measures].[Median Lag] as
median(
StrToSet("RP" +
CStr(Rank([FirstOrderDate].[Full Date].CurrentMember, Dates)))
, [Measures].[Response Days])
select
{[measures].[Median Lag]}
on columns,
Dates
on rows
from UserRevenue