I am trying to calculate percentiles by group (from column values ex: hours by department, sales by region, etc.) within power query. This same logic could be used for other custom group aggregation. After lots of searching, I found 2 potential approaches.
Approach 1:
this archived article which looked to have the perfect answer. Nothing else I could find comes close.
The solution from there is the following custom function:
//PercentileInclusive Function
(inputSeries as list, percentile as number) =>
let
SeriesCount = List.Count(inputSeries),
PercentileRank = percentile * (SeriesCount - 1) + 1, //percentile value between 0 and 1
PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
Percentile1 = List.Max(List.MinN(inputSeries, PercentileRankRoundedDown)),
Percentile2 = List.Max(List.MinN(inputSeries, PercentileRankRoundedUp)),
PercentileInclusive = Percentile1 + (Percentile2 - Percentile1) * (PercentileRank - PercentileRankRoundedDown)
in
PercentileInclusive
Combined with a step in your table to group appropriately and use the function:
=Table.Group(TableName, {"Grouping Column"}, {{"New Column name", each
PercentileInclusive(TableName[Column to calculate Percentile of], percentile # between 0 and 1)}})
[edited to correct the typo Ron R. pointed out and remove unnecessary detail]
Example input:
Pen Type | Units Sold |
---|---|
Ball-Point | 6,109 |
Ball-Point | 3,085 |
Ball-Point | 1,970 |
Ball-Point | 8,190 |
Ball-Point | 6,006 |
Ball-Point | 2,671 |
Ball-Point | 6,875 |
Roller | 778 |
Roller | 9,329 |
Roller | 7,781 |
Roller | 4,182 |
Roller | 2,016 |
Roller | 5,785 |
Roller | 1,411 |
Desired output for a 25% inclusive percentile grouped by Pen Type:
Pen Type | 0.25 Inclusive Percentile (Correct) |
---|---|
Ball-Point | 2,878 |
Roller | 1,714 |
Notes: No decimals shown above, calculated with Excel's PERCENTILE.INC function.
Approach 1 works great.
Approach 2: Here is an alternate Power Query solution I tried. It is a single step with no custom function. It seems like it should do the trick, but I can't figure out a way to make the conditional check be row based. Something needs to go where I have //Condition// that tells it which rows belong in the current rows group, but no matter what I try it does not work. It either breaks, or gives a percentile for everything, ignoring the grouping.
=List.Percentile(Table.Column(Table.SelectRows(#"Previous Step Name", //Condition//), "Column to calculate percentile of"), percentile # 0 to 1)
Any ideas how to make approach 2 work?
It appears your Table.Group
function is incorrectly specified.
Where my previous step was #"Changed Type"
, the following works:
#"Grouped Rows" = Table.Group(#"Changed Type", {"Pen Type"}, {
{"Percentile", each fnPercentileINC([Units Sold],0.25)}})
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Pen Type", type text}, {"Units Sold", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Pen Type"}, {
{"Percentile", each fnPercentileINC([Units Sold],0.25), type number}})
in
#"Grouped Rows"
Edit:
For your approach #2, without a custom function, you can merely use List.Percentile
as an aggregation in the Table.Group
function:
#"Grouped Rows" = Table.Group(#"Changed Type", {"Pen Type"}, {
{"25th Percentile", each List.Percentile([Units Sold],0.25)}
})