Search code examples
mysqloptimizationquery-optimizationranking

MySQL ranking items: is it better to count rows in a subquery or COUNT(*) WHERE foo > bar?


The problem

I'm looking at the ranking use case in MySQL but I still haven't settled on a definite "best solution" for it. I have a table like this:

CREATE TABLE mytable (
  item_id int unsigned NOT NULL,
  # some other data fields,
  item_score int unsigned NOT NULL,
  PRIMARY KEY (item_id),
  KEY item_score (item_score)
) ENGINE=MyISAM;

with some millions records in it, and the most common write operation is to update item_score with a new value. Given an item_id and/or its score, I need to get its ranking, and I currently know two ways to accomplish that.

COUNT() items with higher scores

SELECT COUNT(*) FROM mytable WHERE item_score > $foo;

assign row numbers

SET @rownum := 0;
SELECT rank FROM (
    SELECT @rownum := @rownum + 1 AS rank, item_id
    FROM mytable ORDER BY item_score DESC ) AS result
WHERE item_id = $foo;

which one?

Do they perform the same or behave differently? If so, why are they different and which one should I choose?

any better idea?

Is there any better / faster approach? The only thing I can come up with is a separate table/memcache/NoSQL/whatever to store pre-calculated rankings, but I still have to sort & read out mytable every time I update it. That makes me think it would be a good approach only if the number of "read rank" queries is (much?) greather than the number of updates, on the other hand it should be less useful with the "read rank" queries approaching the number of update queries.


Solution

  • Since you have indexes on your table the only queries to use that makes sense is

    -- findByScore    
    SELECT COUNT(*) FROM mytable WHERE item_score > :item_score; 
    -- findById
    SELECT COUNT(*) FROM mytable WHERE item_score > (select item_score from mytable where item_id = :item_id); 
    

    on findById since you only need rank of 1 item id, it is not much different from join counterpart on performance wise.

    If you need the rank of many items then using join is better.

    Usign "assign row numbers" can not compete here because it wont make use of indexes (in your query not at all and if we would even improve that it is still not as good)

    Also there may be some hidden traps using the assign indexes: if there are multiple items with same score then it will give you rank of last one.

    Unrelated: And please use PDO if possible to be safe from sql injections.