I have a table rel
with fields id
refid
cellid
cat
and some working PHP code to count the number of rows for each cellid
and return this as an array for later use.
<?php
// connect to database
require($DOCUMENT_ROOT . "connect.php");
// count the number of relationships for each cell id
$cell_array = array();
if ($result = mysqli_query($link, "SELECT cellid, COUNT(*) totalcount, cat FROM rel GROUP BY cellid")) {
if (mysqli_num_rows($result)) {
while ($row = mysqli_fetch_assoc($result)) {
$cell_array[$row["cellid"]] = $row['totalcount'];
}
}
}
?>
I now want to add an additional function, calculate the Mode (most common value) of cat
for each cellid
and include that value in the array. If there is more than one mode or no mode then return 9 (cat
can only have values of 1 to 8).
I've been reading around how to find the Mode and the common solution looks like this:
SELECT COUNT(*) frequency, cat, cellid
FROM rel
GROUP BY cat
ORDER BY COUNT(*) DESC
LIMIT 1
But this does not return the most common cat
for each cellid
just the most common cat
in general.
I have been working using a sub-query
SELECT cellid, cat, COUNT( * )
FROM `rel`
GROUP BY cellid, cat
Which produces a row for each cellid
cat
combination, but I'm unsure how to then use this to find the total number of entries for each cellid
and the the most common cat
for each cellid
EDIT
I have made some further progress, I now have a working query to find the mode cat
for each cellid
SELECT cellid, cat
FROM rel t
GROUP BY cellid, cat
HAVING cat = (
SELECT cat
FROM rel
WHERE cellid = t.cellid
GROUP BY cat
ORDER BY COUNT( * ) DESC , cat
LIMIT 1 )
But I am still unsure how to combine the two queries are the output from one cannot be used in the other.
You can combine them by joining them together:
select cell.*, mode.cat as modecat
from (SELECT cellid, COUNT(*) totalcount
FROM rel
GROUP BY cellid
) cell join
(SELECT cellid, cat
FROM rel t
GROUP BY cellid, cat
HAVING cat = (SELECT cat
FROM rel
WHERE cellid = t.cellid
GROUP BY cat
ORDER BY COUNT( * ) DESC , cat
LIMIT 1
)
) mode
on cell.cellid = mode.cnt
This doesn't handle the case of multiple values. If you only wanted an arbitrary value, I would go with:
select cell.*,
(select cat
from rel
where cell.cellid = rel.cellid
group by cellid, cat
order by COUNT(*) desc
limit 1
) as mode
from (SELECT cellid, COUNT(*) totalcount
FROM rel
GROUP BY cellid
) cell
Identifying muliples is a bit trickier. It requires an additional aggregation in the subquery:
select cell.*,
(select (case when min(cat) = max(cat) then min(cat) else 'MULTIPLE' end) as mode
from (select cat, COUNT(*) as cnt
from rel
where cell.cellid = rel.cellid
group by cellid, cat
) t
group by cnt
order by cnt desc
limit 1
) mode
from (SELECT cellid, COUNT(*) totalcount
FROM rel
GROUP BY cellid
) cell