Search code examples
pythonsampling

Sampling real numbers with sum and minimum value constraints


How can I sample N random values such that the following constraints are satisfied?

  1. the N values add up to 1.0
  2. none of the values is less than 0.01 (or some other threshold T << 1/N)

The following procedure was my first attempt.

def proportions(N):
    proportions = list()
    for value in sorted(numpy.random.random(N - 1) * 0.98 + 0.01):
        prop = value - sum(proportions)
        proportions.append(prop)
    prop = 1.0 - sum(proportions)
    proportions.append(prop)
    return proportions

The * 0.98 + 0.01 bit was intended to enforce the ≥ 1% constraint. This works on the margins, but doesn't work internally—if two random values have a distance of < 0.01 it is not caught/corrected. Example:

>>> numpy.random.seed(2000)                                                                                            
>>> proportions(5)                                                                                                     
[0.3397481983960182, 0.14892479749759702, 0.07456518420712799, 0.005868759570153426, 0.43089306032910335]

Any suggestions to fix this broken approach or to replace it with a better approach?


Solution

  • You could adapt Mark Dickinson's nice solution:

    import random
    
    def proportions(n):
        dividers = sorted(random.sample(range(1, 100), n - 1))
        return [(a - b) / 100 for a, b in zip(dividers + [100], [0] + dividers)]
    
    print(proportions(5))
    # [0.13, 0.19, 0.3, 0.34, 0.04]
    # or
    # [0.31, 0.38, 0.12, 0.05, 0.14]
    # etc
    

    Note this assumes "none of the values is less than 0.01" is a fixed threshold

    UPDATE: We can generalize if we take the reciprocal of the threshold and use that to replace the hard-coded 100 values in the proposed code.

    def proportions(N, T=0.01):
        limit = int(1 / T)
        dividers = sorted(random.sample(range(1, limit), N - 1))
        return [(a - b) / limit for a, b in zip(dividers + [limit], [0] + dividers)]