Why are certain variables used in "hotness" calculations?


I recently read a blog post about the Reddit "hotness" formula, and the formula shown below appears to be the one used. There are a couple of values in it whose choice I don't understand. I plan on using this formula as a reference for an app I am involved in, so I'd like to know the reasoning behind them.

1st - Dec 8, 2005 - Why use this date? Also, why use an offset time at all? Why not use the Unix epoch? Was this an arbitrary date, chosen so that the formula is platform independent?

2nd - 45000 - Why use 45000 as a divisor? Is this an arbitrary number, or does it have a specific meaning or purpose?

t = (time the entry was posted) - (Dec 8, 2005)
x = upvotes - downvotes

y = {1 if x > 0, 0 if x = 0, -1 if x < 0}
z = {1 if x < 1, otherwise x}

log(z) + (y * t)/45000
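
For reference, the formula can be transcribed into a few lines of Python. This is only a sketch under assumptions the formula itself does not state: t is measured in seconds, the log is base 10, the reference date is midnight UTC on Dec 8, 2005, and z falls back to 1 whenever x is below 1 so that log(z) is always defined. The function name `hotness` is made up for this example.

```python
from datetime import datetime, timezone
from math import log10

# Assumed reference date: midnight UTC on Dec 8, 2005 (the formula gives only the day).
EPOCH = datetime(2005, 12, 8, tzinfo=timezone.utc)

def hotness(upvotes, downvotes, posted_at, divisor=45000):
    """Transcription of the formula; t is assumed to be in seconds."""
    t = (posted_at - EPOCH).total_seconds()
    x = upvotes - downvotes
    y = 1 if x > 0 else (-1 if x < 0 else 0)
    # Assumption: use 1 whenever x < 1 so log10 never sees zero or a negative number.
    z = x if x >= 1 else 1
    return log10(z) + (y * t) / divisor

# A post made 45000 seconds (12.5 hours) after the reference date, with 10 net upvotes:
posted = datetime(2005, 12, 8, 12, 30, tzinfo=timezone.utc)
print(hotness(10, 0, posted))  # log10(10) + 45000/45000 = 2.0
```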

Solution

  • 1st - Dec 8, 2005 - Why use this date? Also, why use an offset time at all? Why not use the Unix epoch? Was this an arbitrary date, chosen so that the formula is platform independent?

    I suspect this was the "epoch" date in Reddit's original code. That would make it a sensible choice: counting from a recent date rather than from 1970 keeps t closer to zero, so the time term stays small relative to the log term and the scores stay in a convenient numeric range.

  • 2nd - 45000 - Why use 45000 as a divisor? Is this an arbitrary number, or does it have a specific meaning or purpose?

    This is effectively a scaling factor for time. The larger this number, the less effect a post's age has on the overall score. I suspect 45000 was chosen after some testing because it gave a reasonable decay rate given the chosen epoch.
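
To make both points concrete, here is a small sketch. It assumes t is measured in seconds and the log is base 10, neither of which the formula states. Under those assumptions a post needs ten times the net votes to make up for being 45000 seconds (12.5 hours) older. The sketch also checks that, for posts with a positive net score, changing the epoch shifts every score by the same amount and leaves the ordering unchanged. The names `score` and `rank` and the sample posts are hypothetical.

```python
from math import log10

# Unix seconds for 2005-12-08 00:00 UTC (the time of day is an assumption).
DEC_8_2005 = 1134000000

def score(net_votes, posted_ts, epoch_ts=DEC_8_2005, divisor=45000):
    """Formula from the question; timestamps are assumed to be in seconds."""
    t = posted_ts - epoch_ts
    y = (net_votes > 0) - (net_votes < 0)  # sign of the net votes: 1, 0 or -1
    z = max(net_votes, 1)                  # assumption: clamp so log10 is defined
    return log10(z) + y * t / divisor

def rank(posts, **kwargs):
    """Return post names ordered from hottest to coldest."""
    ordered = sorted(posts, key=lambda p: score(p[1], p[2], **kwargs), reverse=True)
    return [p[0] for p in ordered]

# (name, net votes, posted time in Unix seconds); made-up sample data
posts = [
    ("old_popular", 100, DEC_8_2005),
    ("newer", 10, DEC_8_2005 + 50000),
    ("newest", 2, DEC_8_2005 + 100000),
]

print(45000 / 3600)                 # 12.5 hours of age per factor of 10 in votes
print(rank(posts))                  # ['newest', 'newer', 'old_popular']
print(rank(posts, epoch_ts=0))      # same order when counting from the Unix epoch
print(rank(posts, divisor=450000))  # ['old_popular', 'newer', 'newest']
```

With the larger divisor, age counts for ten times less, so the post with the most votes stays on top even though it is the oldest.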