Search code examples
sqlwindow-functionspartitionpercentilepercentile-cont

SQL: How to perform PERCENTILE_CONT of group sum among groups?


Very simply, let's say I have want to SUM the total of a feature within a group.

So I partition over a window function:

SELECT sum(NumberOfPlayers) over (partition by sport) as SumOfSport

Great, this is good. Now.. I want to know the percentile of that sum among all the sums I just made.

A sport with 9 players, for example, among five sports, where it ranked 2nd of five, would be in the 40th percentile.

Let's start simple.. I want to know where the 20th percentile is.

... PERCENTILE_CONT(0.2) WITHIN GROUP (ORDER BY SumOfSport ASC) OVER (PARTITION BY NumberOfPlayers) AS AVGPV20

But this fails. You cannot use a previous window function within the SELECT.

So how do we use PERCENTILE_CONT here without having to do a join?


Solution

  • Use a subquery:

    select sport, sum(NumberOfPlayers),
           percentile_cont(0.2) within group (order by   sum(NumberOfPlayers) asc) over () AS avgpv20
    from t
    group by sport;
    

    Then join this result back in to your original query.

    Trying to do the percentile_cont() without aggregating the data is going to be quite challenging -- because individual values are replicated. It is simpler just to aggregate and join back in.