Here's a table example:
Table: websites
Domain | Category | Popularity
google.com | 0 | 13,430
yahoo.com | 0 | 1,094
facebook.com | 1 | 23,943
youtube.com | 2 | 17,320
imdb.com | 3 | 6,094
stackoverflow.com | 4 | 2,930
ebay.com | 5 | 5,748
yandex.com | 5 | 3,748
I'd like to return the top N results by excluding categories CE = [C1, C2, C3 ... Cn] ordered by popularity.
For example:
The TOP 3 results by excluding search engines CE = [0] ordered by popularity are:
Domain | Category | Popularity
facebook.com | 1 | 23,943
youtube.com | 2 | 17,320
imdb.com | 3 | 6,094
The TOP 3 results by excluding all categories except search engines CE = [1, 2, 3, 4, 5] ordered by popularity are:
Domain | Category | Popularity
google.com | 0 | 13,430
yahoo.com | 0 | 1,094
Let's work on the last query.
I expect it to look like this:
SELECT domain, category, popularity FROM (SELECT domain, category, popularity FROM websites WHERE category != [1, 2, 3, 4, 5] ORDER BY popularity) LIMIT 0, 3
I don't care if the response is ordered or not by popularity. That's something I can redo client-side.
I know this is core-level stuff but is there any way of speeding it up? It is after all a whole table scan!
You can do this with a LIMIT
clause, all you have to do is make sure your WHERE
clause filters out categories you don't want. Try this:
SELECT *
FROM myTable
WHERE category NOT IN(category, list)
ORDER BY popularity DESC
LIMIT 3;
The way you'll pass in the list is by finding a way to build a comma separated list of the categories you want to filter out. For more information on the IN
operator, see this link.
Here is an SQL Fiddle example that shows both your sample cases.