Search code examples
sqlsql-serversql-server-2008distinct-values

Top 10 occuring values in a table


I'm trying to put together a SQL query to gather the top 10 looked-at news items within the past week. I also need it to filter the duplicate ip addresses that have looked at the same news item.

Each time a user enters a page the user's browser query string is taken.
Here's an example of the db setup:

datetime   | ipaddress     | querystring
-----------------------------------------
9/12/2011  | 65.65.65.651  | newsid=3512
9/12/2011  | 65.65.65.658  | newsid=3512
10/12/2011 | 65.65.65.653  | newsid=3514
11/12/2011 | 65.65.65.656  | newsid=3515
11/12/2011 | 65.65.65.651  | newsid=3515
13/12/2011 | 65.65.65.651  | newsid=3516
14/12/2011 | 65.65.65.650  | newsid=3516
14/12/2011 | 65.65.65.650  | newsid=3516

My failed attempt:

 SELECT DISTINCT TOP 10 ipaddress, querystring, Count(*) AS thecount
      FROM [thedb].[dbo].[tblwebstats] 
      WHERE querystring LIKE '%newsid=%' AND datetime > (1 week ago)
      GROUP BY querystring, ipaddress
      ORDER BY Count(*) DESC

Please help me out :)


Solution

  • How about something like this?

    select top 10 querystring, count(querystring) as popularity
    from 
    (
        select distinct ipaddress, querystring
        from 
        (
            select [datetime], ipaddress, querystring
            from tblwebstats
            where querystring LIKE '%newsid=%' AND [datetime] > dateadd(day, -7, getdate())
        ) as datefilter
    ) as distinctfilter
    group by querystring
    order by popularity desc
    

    This query does the following (innermost to outermost):

    1. Filters the original table by date range and querystring as required
    2. Reduces the results from (1) down to distinct pairs of (IP address, querystring), ignoring date
    3. Counts the unique querystring occurrences from (2) and returns the top 10 of them in descending order by count.