Search code examples
sqlpostgresqlherokuberkeley-db

Providing fast `select count(*)` functionality in a web app


I'm re-implementing an app to support a national engineering contest, moving it from a local server to the cloud.

In order to tell a team where they stand at the moment, the query has the form

select 1 + count(*) from team where where score < ?

Teams scores change very dynamically. There may be up to 2 million teams and I need to handle at least 10 of these queries per second.

The original gets the needed performance (actually it already did with 1999 harware) by using a separate Berkeley DB of team/scores records. There is a "record number" feature in Berkeley DB that provides exactly the right functionality, and it's very fast.

Heroku apparently has no way to support Berkeley DB. PostgreSQL, their standard DB, does select count(*) with a full table or index scan, which is way too slow.

Any ideas on how to proceed? I am not wedded to Heroku, but must move to a cloud solution of some kind.


Solution

  • Use redis to store your team data in a sorted set. Then the ZRANK function will return the count you need. Redis is very fast in general, and the ZRANK function is O(log N) expected. It is implemented with skip lists.