Search code examples
sqlsql-serversql-server-2008mediancommon-table-expression

Compute median of column in SQL common table expression


In MSSQL2008, I am trying to compute the median of a column of numbers from a common table expression using the classic median query as follows:

WITH cte AS
(
   SELECT number
   FROM table
) 

SELECT cte.*,
(SELECT 
  (SELECT (   
    (SELECT TOP 1 cte.number  
     FROM     
     (SELECT TOP 50 PERCENT cte.number     
      FROM cte
      ORDER BY cte.number) AS medianSubquery1   
    ORDER BY cte.number DESC)  
    +   
  (SELECT TOP 1 cte.number
   FROM     
    (SELECT TOP 50 PERCENT cte.number    
     FROM cte   
     ORDER BY cte.number DESC) AS medianSubquery2   
   ORDER BY cte.number ASC) ) / 2)) AS median

FROM cte
ORDER BY cte.number

The result set that I get is the following:

NUMBER    MEDIAN
x1        x1
x1        x1
x1        x1
x2        x2
x3        x3

In other words, the "median" column is the same as the "number" column when I would expect the median column to be "x1" all the way down. I use a similar expression to compute the mode and it works fine over the same common table expression.


Solution

  • The problem with your query is that you are doing

    SELECT TOP 1 cte.number FROM...

    but it isn't correlated with the sub query it is correlated with the Outer query so the subquery is irrelevant. Which explains why you simply end up with the same value all the way down. Removing the cte. (as below) gives the median of the CTE. Which is a constant value. What are you trying to do?

    WITH cte AS
        ( SELECT NUMBER
        FROM master.dbo.spt_values
        WHERE TYPE='p'
        )
    
    SELECT cte.*,
    (SELECT 
      (SELECT (   
        (SELECT TOP 1 number  
         FROM     
         (SELECT TOP 50 PERCENT cte.number     
          FROM cte
          ORDER BY cte.number) AS medianSubquery1   
        ORDER BY number DESC)  
        +   
      (SELECT TOP 1 number
       FROM     
        (SELECT TOP 50 PERCENT cte.number    
         FROM cte   
         ORDER BY cte.number DESC) AS medianSubquery2   
       ORDER BY number ASC) ) / 2)) AS median
    FROM cte
    ORDER BY cte.number
    

    Returns

    NUMBER      median
    ----------- -----------
    0           1023
    1           1023
    2           1023
    3           1023
    4           1023
    5           1023
    6           1023
    7           1023