Search code examples
phpmysqlsqlrdbms

Limiting GROUP BY based on COUNT() values in mySQL


I'm logging events into a mySQL database and would like to get the top 3 events for monitoring purposes.

My table eventlog looks like this:

+----+------------------+---------------------+
| id |    eventname     |      eventdate      |
+----+------------------+---------------------+
|  0 | machine1.started | 2016-09-04 19:22:23 |
|  1 | machine2.reboot  | 2016-09-04 20:23:11 |
|  2 | machine1.stopped | 2016-09-04 20:24:12 |
|  3 | machine1.started | 2016-09-04 20:25:12 |
|  4 | machine1.stopped | 2016-09-04 23:23:16 |
|  5 | machine0.started | 2016-09-04 23:24:00 |
|  6 | machine1.started | 2016-09-04 23:24:16 |
|  7 | machine3.started | 2016-09-04 23:25:00 |
|  8 | machine4.started | 2016-09-04 23:26:00 |
|  9 | cluster.alive    | 2016-09-04 23:30:00 |
| 10 | cluster.alive    | 2016-09-05 11:30:00 |
+----+------------------+---------------------+

The query should eventually return the following, holding

  • the top 3 events that occurred most often (based on column eventcounts that is generated by mySQL's COUNT() function), grouped by their eventname
  • only 2 rows where eventcount = 1, but only if 1 is within the top 3 eventcounts (since there are a lot of events that occur just once and therefore would overload my frontend)

Example of the desired result, based on the above table:

+------------+------------------+
| eventcount |    eventname     |
+------------+------------------+
|          3 | machine1.started |
|          2 | machine1.stopped |
|          2 | cluster.alive    |
|          1 | machine0.started |
|          1 | machine2.started |
+------------+------------------+

Please note that I do not need just 3 returned rows but the rows with the 3 highest eventcounts.

I did a lot of experimenting by messing around with the query string below, including multiple selects and questionable CASE ... WHEN conditions, but wasn't able to make it work the way I need.

SELECT COUNT(id) AS 'eventcount', eventname
FROM eventlog
GROUP BY eventname
ORDER BY eventcount DESC;

What is the best approach to get the desired result in a performant way?


Solution

  • here is one way of doing it using variables SQL Fiddle for it: http://sqlfiddle.com/#!9/b3458b/16

    SELECT
      t2.eventcount
      ,t2.eventname
    FROM
    (
      SELECT
          t.eventname
          ,t.eventcount
          ,@Rank:=IF(@PrevCount=t.eventcount,@Rank,@Rank+1) Rank
          ,@CountRownum:=IF(@PrevCount=t.eventcount,@CountRowNum + 1,1) CountRowNum
          ,@PrevCount:= t.eventcount
        FROM
          (
            SELECT
              l.eventname
              ,COUNT(*) as eventcount
            FROM
              eventlog l
            GROUP BY
              l.eventname
            ORDER BY
              COUNT(*) DESC
          ) t
          CROSS JOIN (SELECT @Rank:=0, @CountRowNum:=0, @PrevCount:=-1) var
        ORDER BY
          t.eventcount DESC
    ) t2
    WHERE
      t2.Rank < 4
      AND NOT (t2.eventcount = 1 AND t2.CountRowNum > 2)