Search code examples
sqldatabasecountdistinctlogparser

Count(Distinct x) and Group By y


I have entries with date and time. I want the results to be grouped by the hour (00, 01, 02) and that works, but when i want to get the distinct counts of users, there is an error.

Select Substr(time, 0, 2) as Hour,
 Count(date) as Hits,
 Count(Distinct ip) as Users,
 Count(Distinct X-Forwarded-For) as ForwardedUsers
From table 
Group By Hour

EDIT: I am using the LogParser from Microsoft and i am able to use Group By Hour as it is and X-Forwarded-For is also no problem. The question is how i can use Count(Distinct ip) within the group by


Solution

  • Unfortunately LogParser does not support DISTINCT aggregate functions together with GROUP BY. This should have been clear from the error message you get with the query above:

    Error: Semantic Error: aggregate functions with DISTINCT arguments are not supported with GROUP BY clauses

    One trick you could do is to remove the GROUP BY clause altogether and calculate instead COUNT DISTINCT(hourlyIp) where hourlyIp is a string built concatenating the hour with the IP address. You'll have then to breakup the hourlyIp field back into its components when you process the results.