I'm trying to implement the reddit algorithm as a sorting option in my app but I'm constantly hitting walls all over the place.
I started my implementation using this (Sorting mongodb by reddit ranking algorithm) post as a guide line.
I tried to convert it to c#; below is my attempt at the conversion.
var map = new BsonJavaScript(
@"function() {
function hot(score, date){
var order = log10(Math.max(Math.abs(score), 1));
var sign = score>0 ? 1 : score<0 ? -1 : 0;
var seconds = epochSeconds(date) - 1134028003;
var product = order + sign * seconds / 45000;
return Math.round(product*10000000)/10000000;
}
function log10(val){
return Math.log(val) / Math.LN10;
}
function epochSeconds(d){
return (d.getTime() - new Date(1970,1,1).getTime())/1000;
}
emit( hot(this.VoteCount, this.CreatedAt), this );
}"
);
var reduce = new BsonJavaScript(
@"function(){}"
);
var finalize = new BsonJavaScript(
@"{ 'out': { 'inline': 1 } }"
);
return db.Posts.MapReduce(new MapReduceArgs { MapFunction = map, ReduceFunction = reduce, FinalizeFunction = finalize }).GetResults();
He's the results I'm getting from the implementation;
He's the actual dataset.
For some reason the function returns 2 objects instead of 4. Also, what would I need to modify for the function to return the entire post object along with the calculated score?
Would really appreciate it if someone would help me out :)
Thanks in advance, Jean
Fixed it by making 2 modifications.
These 2 resources where extremely helpful; http://docs.mongodb.org/manual/reference/command/mapReduce/#mapreduce-map-cmd http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/#db.collection.mapReduce
Firstly I changed what parameters I pass to emit. I'm assigning a "score" value to the post object on the fly and run the hot function on it. Then I pass the key parameter for emit as the object key and the value parameter as the new post object with the score value. **emit(key, value)
this.score = hot(this.VoteCount, this.CreatedAt);
emit( this._id, this );
Then I changed how I get the results to;
db.Posts.MapReduce(new MapReduceArgs { MapFunction = map, ReduceFunction = reduce}).InlineResults
Hope this is helpful someone else :)
I'll post benchmarks using this method of calculating the score to calculating the score in C# when I have some free time.
Alt implementation / Update: I switched to a simpler / faster decay algorithm used by Hacker News as it still meets my requirements. http://amix.dk/blog/post/19574
Score = (P-1) / (T+2)^G
where,
P = points of an item (and -1 is to negate submitters vote)
T = time since submission (in hours)
G = Gravity, defaults to 1.8 in news.arc