Tags: sql, sql-server, t-sql, ssas

Calculate a range of percentiles in SQL Server 2014


I'm using SQL Server 2014, and I can install any add-ons (SSAS, whatever's needed).

I have a table that for simplicity contains a single column with integers. It's safe to assume it contains 10^5 - 10^6 rows.

I want to calculate the 5th, 10th, 15th, ..., 90th, and 95th percentiles.

I can do it quite easily by calling percentile_cont (or percentile_disc) once per percentile, which comes to 19 calls. Unfortunately, it takes forever.
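For reference, the repeated-call approach looks roughly like the sketch below. The table name t and column name num are assumptions borrowed from the answer's query, and only three of the calls are written out; the rest follow the same pattern.

```sql
-- Sketch of the slow baseline: one analytic call per percentile.
-- t(num) is an assumed table/column name.
SELECT DISTINCT
       PERCENTILE_CONT(0.05) WITHIN GROUP (ORDER BY num) OVER () AS p05,
       PERCENTILE_CONT(0.10) WITHIN GROUP (ORDER BY num) OVER () AS p10,
       -- ... one line per remaining percentile ...
       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY num) OVER () AS p95
FROM t;
```

In SQL Server these functions require an OVER clause and return the same values on every row, hence the DISTINCT to collapse the result to a single row.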

The thing is, I'm quite certain those calls do the same thing over and over again.

Is there a way to calculate a range of percentiles over a table faster?

I can easily do this in C# by loading all rows into memory, sorting them, and picking the items at index 0.05*array.Length, 0.1*array.Length, etc. - and it's obviously blazingly fast. I can of course replicate that with a table variable or a temporary table, but I'm surprised there isn't some built-in way.
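The "sort once, then pick by position" idea can also be sketched directly in T-SQL without a temporary table. This is only an illustration under assumptions: t(num) is an assumed table and column, and picking the row at position CEILING(p * N / 100) gives a percentile_disc-style result rather than the interpolated percentile_cont value.

```sql
-- Sketch only: t(num) is an assumed table/column name.
WITH ordered AS (
    SELECT num,
           ROW_NUMBER() OVER (ORDER BY num) AS rn,   -- 1-based position in sorted order
           COUNT(*)     OVER ()             AS cnt   -- total row count, repeated on every row
    FROM t
),
pcts AS (
    -- The percentiles of interest: 5, 10, ..., 95
    SELECT p
    FROM (VALUES (5),(10),(15),(20),(25),(30),(35),(40),(45),(50),
                 (55),(60),(65),(70),(75),(80),(85),(90),(95)) AS v(p)
)
SELECT pcts.p AS percentile,
       o.num  AS pct_value
FROM ordered AS o
JOIN pcts
  ON o.rn = CEILING(pcts.p * o.cnt / 100.0)  -- row whose position matches the percentile
ORDER BY pcts.p;
```

Whether this actually beats the repeated calls depends on the execution plan, so it is worth measuring on the real table.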


Solution

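  • One method is to use ntile() and aggregation. With 20 buckets over the sorted values, the upper bound of each bucket approximates the 5th, 10th, ... percentile: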

    select nt,
           min(num) as min_num,
           max(num) as max_num,  -- upper bound of bucket nt, roughly the (5 * nt)th percentile
           count(*) as cnt
    from (select t.*, ntile(20) over (order by num) as nt
          from t
         ) t
    group by nt
    order by nt;
    

    I should also add that SQL Server should be able to use an index on the column for the window functions. So, you might be able to speed up your original approach just by adding an index.
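As a rough sketch of how both suggestions might be combined: create an index on the ordered column, then read each percentile off the upper bound of its ntile bucket. The index name ix_t_num is hypothetical, and t(num) is the same assumed table as in the query above. The result is approximate: ntile() puts any leftover rows into the first buckets, so the bucket boundaries line up exactly with percentile_disc only when the row count divides evenly by 20.

```sql
-- Hypothetical index name; lets the ORDER BY num in the window function read pre-sorted data.
CREATE INDEX ix_t_num ON t (num);

-- The upper bound of bucket k approximates the (5 * k)th percentile.
SELECT nt * 5   AS percentile,
       MAX(num) AS approx_value
FROM (SELECT num,
             NTILE(20) OVER (ORDER BY num) AS nt
      FROM t
     ) AS tiles
WHERE nt < 20          -- bucket 20 ends at the maximum, which is not a requested percentile
GROUP BY nt
ORDER BY nt;
```

Checking the actual execution plan is the only way to confirm the index is used for the sort.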