I'm working on a web application where the user provides parameters, and these are used to produce a list of the top 1000 items from a database of up to 20 million rows. I need all top 1000 items at once, and I need this ranking to happen more or less instantaneously from the perspective of the user.
Currently, I'm using a MySQL with a user-defined function to score and rank the data, then PHP takes it from there. Tested on a database of 1M rows, this takes about 8 seconds, but I need performance around 2 seconds, even for a database of up to 20M rows. Preferably, this number should be lower still, so that decent throughput is guaranteed for up to 50 simultaneous users.
I am open to any process with any software that can process this data as efficiently as possible, whether it is MySQL or not. Here are the features and constraints of the process:
Thanks
A feasible approach seems to be to load (and later update) all data into about 1GB RAM and perform the scoring and ranking outside MySQL in a language like C++. That should be faster than MySQL.
The scoring must be relatively simple for this approache because your requirements only leave a tenth of a microsecond per row for scoring and ranking without parallelization or optimization.