Tags: python, algorithm, statistics, ranking

Python implementation of the Wilson Score Interval?


After reading "How Not to Sort by Average Rating", I was curious whether anyone has a Python implementation of the lower bound of the Wilson score confidence interval for a Bernoulli parameter.


Solution

  • Reddit uses the Wilson score interval for comment ranking. An explanation and a Python implementation can be found here

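    # Rewritten code from /r2/r2/lib/db/_sorts.pyx
    
    from math import sqrt
    
    def confidence(ups, downs):
        """Lower bound of the Wilson score interval for the upvote fraction."""
        n = ups + downs
    
        # No votes yet: return 0 rather than dividing by zero.
        if n == 0:
            return 0.0
    
        # z-score for the two-sided confidence level:
        # 1.0 ~ 68%, 1.44 ~ 85%, 1.96 ~ 95%
        z = 1.0
        phat = float(ups) / n  # observed fraction of upvotes
        centre = phat + z*z/(2*n)
        margin = z * sqrt((phat*(1-phat) + z*z/(4*n)) / n)
        return (centre - margin) / (1 + z*z/n)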
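
The `confidence` function above hard-codes `z`. As a minimal sketch of how the confidence level could be made a parameter, the variant below derives `z` from the standard normal distribution. It assumes Python 3.8+ for `statistics.NormalDist`. The name `wilson_lower_bound`, the `confidence` parameter, and the vote counts are made up for illustration and do not come from Reddit's code.

```python
from math import sqrt
from statistics import NormalDist  # available since Python 3.8


def wilson_lower_bound(ups, downs, confidence=0.95):
    """Lower bound of the two-sided Wilson score interval for ups / (ups + downs)."""
    n = ups + downs
    if n == 0:
        return 0.0
    # Two-sided interval: (1 - confidence) / 2 of the probability mass in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = ups / n
    centre = phat + z * z / (2 * n)
    margin = z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / (1 + z * z / n)


# Hypothetical (ups, downs) counts, invented for this example.
items = {"a": (1, 0), "b": (90, 10), "c": (500, 100)}

# Rank by the lower bound, highest first.
ranked = sorted(items, key=lambda k: wilson_lower_bound(*items[k]), reverse=True)
print(ranked)
```

With these counts, item `a` has a perfect average from a single vote, yet its lower bound is only about 0.21, so it ranks below `b` and `c`, which have lower averages but far more votes. That is the behaviour the question is after: a small sample is not allowed to dominate the ranking.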