Search code examples
phpmysqlperformancelevenshtein-distance

How to count the number of (number of duplicates) in a database


I am writing some PHP / MySQL to detect excessive site visits. While I find it simple enough to detect, for any given IP address, how many times that person has visited in say 24 hours and so whether they have exceeded some number that is the visit maximum, what I am struggling with is some sql that will pick out the overall trend - how many IP's have each visited more than the threshold number of times in the period. Please consider the following simplified database table which I hope will explain more clearly.

 VisidID        VisitIP      VisitDate
 ---------------------------------- 
    1           n.n.n.01     (yesterday) 
    2           n.n.n.02     (today) 
    3           n.n.n.01     (today) 
    4           n.n.n.03     (today) 
    5           n.n.n.03     (today) 
    6           n.n.n.03     (today) 
    7           n.n.n.04     (today) 
    8           n.n.n.05     (today) 
    9           n.n.n.04     (today) 
    10          n.n.n.06     (today) 
    11          n.n.n.05     (today) 
    12          n.n.n.07     (today)
     ------------------------------------

So, assuming I set the maximum allowed visit threshold to 2 (just to keep my example table here small) the result I am looking for from this table query is '3' since there were three IP addresses (n.n.n.03, n.n.n.04 and n.n.n.05) whose visits today each equalled or exceeded the visit threshold of 2 visits.

How would I capture that in a MySQL query, and would it be a fast and efficient or burdonsome query to run?


Solution

  • Using the HAVING clause, this query will return those having a count higher than 2 -

    SELECT `VisitIP`, COUNT(`VisitIP`) AS `CountIP`
    FROM `table`
    GROUP BY `VisitIP`
    HAVING `CountIP` > 2
    

    To count the total meeting the condition you can do something like this -

    SELECT COUNT(*) AS `TotalOverTwo` FROM
        (SELECT `VisitIP`, COUNT(`VisitIP`) AS `CountIP`
         FROM `table`
         GROUP BY `VisitIP`
         HAVING `CountIP` > 2 
        ) `table`