Search code examples
phpmysqlperformancerecords

How can I get the top N records of a MySQL table by excluding an array of groups?


Here's a table example:

Table: websites

Domain            | Category | Popularity

google.com        | 0        | 13,430
yahoo.com         | 0        | 1,094
facebook.com      | 1        | 23,943
youtube.com       | 2        | 17,320
imdb.com          | 3        | 6,094
stackoverflow.com | 4        | 2,930
ebay.com          | 5        | 5,748
yandex.com        | 5        | 3,748

I'd like to return the top N results by excluding categories CE = [C1, C2, C3 ... Cn] ordered by popularity.

For example:

The TOP 3 results by excluding search engines CE = [0] ordered by popularity are:

Domain            | Category | Popularity

facebook.com      | 1        | 23,943
youtube.com       | 2        | 17,320
imdb.com          | 3        | 6,094

The TOP 3 results by excluding all categories except search engines CE = [1, 2, 3, 4, 5] ordered by popularity are:

Domain            | Category | Popularity

google.com        | 0        | 13,430
yahoo.com         | 0        | 1,094

Let's work on the last query.

I expect it to look like this:

SELECT domain, category, popularity FROM (SELECT domain, category, popularity FROM websites WHERE category != [1, 2, 3, 4, 5] ORDER BY popularity) LIMIT 0, 3
  1. How do I pass an array of excluded categories as a WHERE statement?
  2. How do I order it by ascending / descending popularity order?
  3. Do I order it before or after I select the ones with the right category type (performance wise)?

I don't care if the response is ordered or not by popularity. That's something I can redo client-side.

I know this is core-level stuff but is there any way of speeding it up? It is after all a whole table scan!


Solution

  • You can do this with a LIMIT clause, all you have to do is make sure your WHERE clause filters out categories you don't want. Try this:

    SELECT *
    FROM myTable
    WHERE category NOT IN(category, list)
    ORDER BY popularity DESC
    LIMIT 3;
    

    The way you'll pass in the list is by finding a way to build a comma separated list of the categories you want to filter out. For more information on the IN operator, see this link.

    Here is an SQL Fiddle example that shows both your sample cases.