Search code examples
sqloracle-databaseplsql

cume_dist vs percent_rank or difference between


In the following query i return different result but can't figure out the what are this two functions.

select firstname,gender,weight,height,
  cume_dist() over (order by height) cd,
  percent_rank() over (order by height) pr
from childstat order by height

FIRSTNAME                                          GENDER     WEIGHT     HEIGHT         CD         PR
-------------------------------------------------- ------ ---------- ---------- ---------- ----------
buddy                                              m             150         15 0.2857142857          0 
Albert                                             m             923         15 0.2857142857          0 
rosemary                                           f             123         35 0.4285714286 0.3333333333 
lauren                                             f             876         54 0.5714285714        0.5 
furkar                                             m             198         76 0.7142857143 0.6666666667 
tommy                                              m             167         78 0.8571428571 0.8333333333 
simon                                              m             256         87          1          1 

Solution

  • Are you asking what these functions do?

    http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions035.htm

    CUME_DIST calculates the cumulative distribution of a value in a group of values.

    In your example this means that ~29% has less or equal height than buddy. ~57% has less or equal height than lauren. Etc.

    http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions109.htm

    PERCENT_RANK is similar to the CUME_DIST (cumulative distribution) function. The range of values returned by PERCENT_RANK is 0 to 1, inclusive. The first row in any set has a PERCENT_RANK of 0. The return value is NUMBER.