Tags: algorithm, optimization, autosuggest, collaborative-filtering

How to optimize a database suggestion engine


I'm making an online engine for item-to-item movie recommendations. I have done some research, and I think the best way to implement it is to use Pearson correlation and build a table with item1, item2, and correlation fields. The problem is that after each new rating of an item, I have to regenerate the correlations for, in the worst case, N records (where N is the number of items). A sketch of the approach is shown below.
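As a minimal sketch of the item-to-item Pearson approach described above (the `ratings` structure here is a hypothetical in-memory mapping, not your actual schema):

```python
# ratings is assumed to be {item_id: {user_id: rating}}.
from math import sqrt

def pearson(ratings, item1, item2):
    """Pearson correlation between two items over the users who rated both."""
    common_users = set(ratings[item1]) & set(ratings[item2])
    n = len(common_users)
    if n == 0:
        return 0.0

    x = [ratings[item1][u] for u in common_users]
    y = [ratings[item2][u] for u in common_users]

    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))

    num = sum_xy - (sum_x * sum_y) / n
    den = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return num / den if den else 0.0

# Refreshing the (item1, item2, correlation) table after one new rating means
# recomputing this for every pair involving the rated item -- the worst-case
# O(N) cost mentioned above.
```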

Another thing I read is the following article, but I haven't figured out a way to implement it.

So what would you suggest to optimize this process? Or do you have any other suggestions? Thanks.


Solution

  • The current approach to this kind of "shopping cart" problem is to use Singular Value Decomposition (SVD). The top three finishers in the Netflix Prize all used SVD. SVD is used to perform dimensionality reduction on the huge items × users rating matrix. The good news is that incremental methods exist, so adding a few observations to the dataset does not require recomputing the entire decomposition. A sketch follows below.
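As a rough illustration (not the Netflix winners' exact method), here is a NumPy sketch of truncated SVD over a small users × items rating matrix, with item-to-item similarity computed in the reduced space and a "folding-in" step that adds a new user without redoing the full decomposition. The matrix, the rank `k`, and the new-user vector are all made-up examples:

```python
import numpy as np

# Hypothetical dense rating matrix: rows = users, columns = items (0 = unrated).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
], dtype=float)

k = 2  # number of latent dimensions to keep

# Truncated SVD: R ~= U_k @ S_k @ Vt_k
U, s, Vt = np.linalg.svd(R, full_matrices=False)
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Each row of item_factors is an item's k-dimensional latent profile.
item_factors = Vt_k.T                      # shape: (num_items, k)

# Item-to-item similarity in the reduced space (cosine similarity).
unit = item_factors / np.linalg.norm(item_factors, axis=1, keepdims=True)
item_sim = unit @ unit.T                   # shape: (num_items, num_items)

# "Folding in" a new user incrementally: project the new rating vector
# onto the existing item factors instead of recomputing the SVD.
new_user = np.array([0, 4, 0, 5], dtype=float)
new_user_factors = new_user @ Vt_k.T @ np.linalg.inv(S_k)
```

The item-to-item similarities can then be precomputed once per (periodic) decomposition, while new ratings are folded in between rebuilds; that is the sense in which the incremental methods avoid regenerating everything on every rating.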