I am looking for a tricky MySQL query to return the most common column value for each value in another column. I could use PHP to do it by each result, but it'd be so much cooler to do it with a single query :)
For example, if have a table like this, called "transactions":
Id payee exp category
1 Amazon 25.00 Gifts
2 Amazon 30.21 Books
3 Amazon 12.98 Gifts
4 Amazon 15.00 Groceries
5 Amazon 14.54 Gifts
6 Alibaba 55.55 Stock
7 Alibaba 99.00 Stock
8 Alibaba 12.00 Fun
…
The type of result I would want is like this:
payee count(payee) category
Amazon 3 Gifts
Alibaba 2 Stock
…
I can do this:
SELECT `payee`, COUNT(`payee`), `category` FROM `transactions` WHERE 1 GROUP BY category ORDER BY COUNT(`payee`) DESC
and get close to what I want:
Amazon 3 Gifts
Alibaba 2 Stock
Amazon 1 Books
Amazon 1 Groceries
Alibaba 1 Fun
but I don't want the non-maximum counts (like Amazon,1,Books for example).
Do I have to do a subquery or something? Or use in?
You could filter the results of your existing query with a correlated subquery in a having
clause, as follows:
select payee, count(*), category
from transactions t
group by payee, category
having count(*) = (
select count(*)
from transactions t1
where t1.payee = t.payee
group by category
order by count(*) desc limit 1
)
order by count(*) desc
payee | count(*) | category :------ | -------: | :------- Amazon | 3 | Gifts Alibaba | 2 | Stock
Alernatively, if you are running MySQL 8.0, you can rank the categories of each payee with window function rank() over()
, and filter on the top record per group:
select payee, cnt, category
from (
select
payee,
count(*) cnt,
category,
rank() over(partition by payee order by count(*) desc) rn
from transactions
group by category, payee
) t
where rn = 1