Tags: python, random, scipy, distribution

Continuous distribution in scipy that looks like two powerlaws?


I need to generate random samples from a distribution that looks roughly like this:

[image: plot of the desired density]

To generate the image above I combined two powerlaws:

y = a * x**(a - 1) + b * x**(b - 1)

with parameters a and b controlling the shape of the distribution. A single shape parameter would be even better.

There are dozens of distributions defined in scipy's Continuous distributions section but I haven't been able to find one that matches what I need.


Solution

  • In this case, the answer is simple.

    For a, b > 0, each term c * x**(c - 1) integrates to 1 over [0, 1], so the integral of a * x**(a - 1) + b * x**(b - 1) over that interval is 2. Dividing by 2 gives a proper density, which is an equal-weight mixture of two power-law distributions: each component is drawn with probability 1/2.* Specifically, your distribution has the following density function:

    (a * x**(a - 1))/2 + (b * x**(b - 1))/2.
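The normalization above can be checked numerically. The following is a minimal sketch, assuming example values a = 2 and b = 5 (chosen only for illustration) and using `scipy.integrate.quad`; the helper names `unnormalized` and `mixture_pdf` are hypothetical.

```python
from scipy import integrate

def unnormalized(x, a, b):
    # The sum of the two power-law terms from the question.
    return a * x ** (a - 1) + b * x ** (b - 1)

def mixture_pdf(x, a, b):
    # Halving the sum gives the equal-weight mixture density.
    return 0.5 * unnormalized(x, a, b)

a, b = 2.0, 5.0  # illustrative shape parameters, both > 0

# Integrate both functions over the support [0, 1].
total, _ = integrate.quad(unnormalized, 0.0, 1.0, args=(a, b))
area, _ = integrate.quad(mixture_pdf, 0.0, 1.0, args=(a, b))

print(total, area)  # expected to be close to 2 and 1 respectively
```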

    Then, a simple algorithm to sample from this distribution is:

    • Generate u, a uniform random variate in [0, 1].
    • With probability 1/2, return u**(1/a). Otherwise, return u**(1/b).

    Code follows:

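    import random

    import numpy as np
    import scipy.stats as st

    def mixpowerlaw(a, b, size):
        # Returns a NumPy array of `size` samples from the equal-weight
        # mixture of powerlaw(a) and powerlaw(b).
        unif = st.uniform.rvs(size=size)
        # Draw from both components, then pick one per sample.
        xa = st.powerlaw.rvs(a, size=size)
        xb = st.powerlaw.rvs(b, size=size)
        return np.where(unif < 0.5, xa, xb)

    # Alternate implementation: returns a single sample by
    # inverting the CDF of the chosen component.
    def mixpowerlaw2(a, b):
        if random.random() < 0.5:
            return random.random() ** (1.0 / a)
        else:
            return random.random() ** (1.0 / b)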
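One way to check that a sampler like the ones above matches the target density is to compare samples against the mixture's CDF, (x**a + x**b)/2, with a Kolmogorov-Smirnov test. This is only a sketch: the helper names `sample_mixture` and `mixture_cdf`, the seed, and the values a = 0.5, b = 3 are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

def sample_mixture(a, b, size, rng):
    # Inverse-CDF draw for each component, chosen with probability 1/2.
    u = rng.random(size)
    pick_a = rng.random(size) < 0.5
    return np.where(pick_a, u ** (1.0 / a), u ** (1.0 / b))

def mixture_cdf(x, a, b):
    # CDF of the equal-weight mixture on [0, 1].
    return 0.5 * x ** a + 0.5 * x ** b

rng = np.random.default_rng(12345)  # arbitrary seed
a, b = 0.5, 3.0                     # illustrative shape parameters
samples = sample_mixture(a, b, 100_000, rng)

# A small KS statistic indicates the samples follow the target CDF.
result = stats.kstest(samples, mixture_cdf, args=(a, b))
print(result.statistic, result.pvalue)
```

The sample mean can also be compared with the exact value (a/(a+1) + b/(b+1))/2 as a quick sanity check.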
    
    

    * This is the nature of a mixture distribution's PDF. Your example adds two PDFs, each of which integrates to 1, so the sum integrates to 2 and must be divided by 2 to give a PDF that integrates to 1 again. The result is a mixture of the form (PDF1)*(1/2) + (PDF2)*(1/2), and sampling from it means sampling from PDF1 with probability 1/2 and from PDF2 otherwise. See also p. 66 of "Non-Uniform Random Variate Generation" by Devroye.
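Since the question asks for something usable like scipy's built-in distributions, the mixture could also be wrapped in a subclass of `scipy.stats.rv_continuous`. The following is a rough sketch, not a tested library component: the class name `MixPowerLaw` and the instance name `mixpl` are hypothetical, and the `_rvs` signature assumes a reasonably recent SciPy that passes `size` and `random_state` as keyword arguments.

```python
import numpy as np
from scipy import stats

class MixPowerLaw(stats.rv_continuous):
    """Equal-weight mixture of two power-law densities on [0, 1]."""

    def _pdf(self, x, a, b):
        return 0.5 * a * x ** (a - 1) + 0.5 * b * x ** (b - 1)

    def _cdf(self, x, a, b):
        return 0.5 * x ** a + 0.5 * x ** b

    def _rvs(self, a, b, size=None, random_state=None):
        # Pick a component with probability 1/2, then invert its CDF.
        u = random_state.uniform(size=size)
        pick_a = random_state.uniform(size=size) < 0.5
        return np.where(pick_a, u ** (1.0 / a), u ** (1.0 / b))

# The constructor's a and b are the support bounds, not the shape parameters.
mixpl = MixPowerLaw(a=0.0, b=1.0, name="mixpowerlaw")

# Shape parameters are passed positionally, as with other scipy distributions.
x = mixpl.rvs(0.5, 3.0, size=1000, random_state=1)
print(x[:5], mixpl.pdf(0.5, 0.5, 3.0), mixpl.cdf(0.5, 0.5, 3.0))
```

Defining `_rvs` explicitly avoids the generic sampling path, which would otherwise invert the CDF numerically and be much slower.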