I'm trying to convert this median calculation in MySQL to SQL Server.
http://danielsetzermann.com/howto/how-to-calculate-the-median-per-group-with-mysql/
Here is the code:
SET @row_number:=0;
SET @median_group:='';
SELECT
median_group, AVG(height) AS median
FROM
(SELECT
@row_number:=
CASE
WHEN @median_group = gender THEN @row_number + 1
ELSE 1
END AS count_of_group,
@median_group:=gender AS median_group,
gender,
height,
(SELECT
COUNT(*)
FROM heights
WHERE a.gender = gender)
AS total_of_group
FROM
(SELECT
gender,
height
FROM heights
ORDER BY gender, height)
AS a)
AS b
WHERE
count_of_group BETWEEN total_of_group / 2.0 AND total_of_group / 2.0 + 1
GROUP BY median_group
This is as far as I got:
DECLARE @row_number integer, @median_group varchar(100)
SET @row_number = 0
SET @median_group = ''
SELECT
@row_number = (CASE WHEN @median_group = gender THEN @row_number + 1 ELSE 1 END) AS count_of_group,
(@median_group = gender) AS median_group,
gender,
height,
(SELECT COUNT(*)
FROM heights
WHERE a.gender = gender)
AS total_of_group
FROM
(SELECT
gender,
height
FROM heights)
AS a
ORDER BY a.gender , a.height
Sadly I don't know how to get past this error:
A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.
I know there are a few other ways to calculate median that are SQL Server specific, but not knowing how to make this conversion is eating at me.
Why would you convert that code when you can do something like this?
select distinct gender, percentile_cont(0.5) over (partition by gender order by height)
from heights;
SQL Server has many other approaches for such a calculation.