Search code examples
phpmysqlmedian

Calculating Median for columns mysql


I have the following columns in mysql,

Area, Price, Type, Quarter, Year etc.

What I am trying to do, is calculate the median for every area (Juffair, Adliya, Mahooz) over these conditions.

A typical row would look something like this

Burhama, 500, Apartment, 3, 2014..
Burhama, 600, Apartment, 3, 2014
Juffair, 800, Apartment, 3, 2014 etc.

What I want it to do is display a table with (Area - Median). So it should take all of Juffair and calculate the median. etc.

What it is doing is: Displaying every single area(duplicate times) with it's price next to it.

Moving around the loops, I get it to display, only the area's, but calculates the medians, across all the area's, instead of just the one that it is suppose to.

Any help would be appreciated.

<?php
function calculate_median($arr) {
sort($arr);
$count = count($arr); //total numbers in array
$middleval = floor(($count-1)/2); // find the middle value, or the lowest middle value
if($count % 2) { // odd number, middle is the median
    $median = $arr[$middleval];
} else { // even number, calculate avg of 2 medians
    $low = $arr[$middleval];
    $high = $arr[$middleval+1];
    $median = (($low+$high)/2);
}
return $median;
}
?>


<?php

$array = array('Juffair', 'Adliya', 'Mahooz', 'Burhama' , 'Manama');
$price = array();
//connect
$data = mysql_query("SELECT * FROM stats 
WHERE `rentsale` = 'Rent'
and `type` = 'Apartment' 
and `area` IN('".implode("', '", $array)."') 
and `beds` = '2' 
and `quarter` = '3' 
and `year` = '2014'") 
//die
while($info = mysql_fetch_array( $data )) {

$area = $info['area'];
$array = array('Juffair', 'Adliya', 'Mahooz', 'Burhama' , 'Manama');
foreach( $array as $area ) {
$price[] = $info['price']; 
$home_values_array = $price;
$median_home_value = calculate_median($home_values_array);
}
?>
<table>
<tr>
<TD><?php echo $area; ?></TD>
<TD><?php echo $median_home_value; ?></TD>
</tr>
</table>
<?php

}


 ?> 

Solution

  • If you want to calculate the median in MySQL, I would suggest using variables (which neither of the referenced answers suggests):

    select area, avg(price) as median
    from (select s.*,
                 (@rn := if(@a = area, @rn + 1,
                            if(@a := area, 1, 1)
                           )
                 ) seqnum
          from stats s cross join
               (select @a := '', @rn := 0) vars
          order by area, price
         ) s join
         (select area, count(*) as cnt
          from stats
          group by area
         ) a
         on s.area = a.area
    where 2*seqnum in (cnt - 1, cnt, cnt + 1)
    group by area;
    

    The above query calculates the median for all the data. The first subquery enumerates the values, and the second gets the total count.

    If you want a filter, such as as by year, then you need to add it to both subqueries.