I have a very simple table (LOG) containing attributes MAC_ADDR, IP_SRC, IP_DST, URL, PROTOCOL. I want the first n lines containing IP_SRC, URL, #OfOccurrences ordered by decreasing #OfOccurrences for each IP_SRC in my table when PROTOCOL='DNS'.
To be clearer, I want to be able to list the first n most visited pages for each IP_SRC in my table.
I can get the most visited URL for each IP_SRC like this :
select ip_src,url,cnt
from (
select ip_src,url,count(*) as cnt,protocol
from log as b group by ip_src,url order by ip_src,cnt desc
) as c
where cnt>=(select MAX(cpt)
from (select count(*) as cpt from log as b
where c.ip_src==b.ip_src group by ip_src,url)
)
and protocol='DNS';
However, this solution is obviously not optimized.
Here is a more practical code (for the most visited URL for each IP_SRC) :
select ip_src,url,cnt
from (select ip_src,url,count(*) as cnt
from log where protocol='DNS'
group by ip_src,url
order by ip_src,cnt asc)
group by ip_src;
This second option is way more faster ! However, I want the n most visited pages for each IP_SRC, and I can't figure out how to do.
Thanks for your help.
Finally, with the use of a temporary table, I could manage to get what I wanted.
--First create a temp table of occurences
CREATE TEMPORARY TABLE TEMP1 AS
SELECT ip_src,url,count(*) AS cnt
FROM LOG
WHERE protocol='DNS'
GROUP BY ip_src,url
ORDER BY ip_src,cnt,url DESC;
--Then use a classic limit per group query
SELECT T1.ip_src,T1.url,T1.cnt
FROM TEMP1 AS T1
WHERE T1.url in (
SELECT T2.url
FROM TEMP1 AS T2
WHERE T2.ip_src=T1.ip_src and T2.cnt>=T1.cnt
ORDER BY T2.cnt DESC
LIMIT 3 --Or whatever you want it to be
)
ORDER BY T1.ip_src ASC,T1.cnt DESC;
If someone knows how to do the same without the need of a temp table (or explain me why a temp table is a good solution), please express yourself.