I want to calculate the median and average for given coordinates in a specific radius.
The important attributes are: - latitude - longitude - price
The sql command to calculate the average is:
SELECT avg(price) as average
FROM (SELECT r.*,
( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance
FROM `Rental` r
) r
WHERE distance <= 20;
My question is how can I calculate the median for the price in the given coordinates and radius. MySQL has no median() function.
EDIT: Now I have tried the code from Simple way to calculate median with MySQL
SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.price AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.price
FROM rental AS x, (SELECT @row:=0) AS r
WHERE 1
-- put some where clause here
ORDER BY x.price
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM rental x
WHERE 1
-- put same where clause here
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;
It works for all the 200'000 records, but when I will add the WHERE distance <= 20
is the mysql - request overloaded.
SELECT AVG(middle_values) AS 'median' FROM (
SELECT t1.price AS 'middle_values' FROM
(
SELECT @row:=@row+1 as `row`, x.price
FROM rental AS x, (SELECT @row:=0) AS r, (SELECT a.*,
( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance
FROM `Rental` a
) a
WHERE distance <= 20
-- put some where clause here
ORDER BY x.price
) AS t1,
(
SELECT COUNT(*) as 'count'
FROM rental x, (SELECT a.*,
( 6371 * acos( cos( radians(37.3541079) ) * cos( radians( ANY_VALUE(`latitude` )) ) * cos( radians( ANY_VALUE(`longitude`) ) - radians(-121.9552356) ) + sin( radians(37.3541079) ) * sin( radians( ANY_VALUE(`latitude`) ) ) ) ) AS distance
FROM `Rental` a
) a
WHERE distance <= 20
-- put same where clause here
) AS t2
-- the following condition will return 1 record for odd number sets, or 2 records for even number sets.
WHERE t1.row >= t2.count/2 and t1.row <= ((t2.count/2) +1)) AS t3;
Is there somewhere a misstep?
The problem is with the table scan to compute the distances, not with the median.
TEMPORARY TABLE
so you don't have to evaluate it 3 times (avg, count, and median).WHERE
to limit the checks to a 20x20 "square".INDEX(latitude)
HAVING distance < 20
instead of needing yet-another subquery.