Search code examples
randomjuliasamplesampling

Julia: best way to sample from successively shrinking range?


I would like to sample k numbers where the first number is sampled from 1:n and the second from 1:n-1 and the third from 1:n-2 and so on.

I have the below implementation

function shrinksample(n,k)
    [rand(1:m) for m in n:-1:n-k+1]
end

Are there faster solutions in Julia?


Solution

  • The following takes ideas from the implementation of randperm and since n and k are of the same order, this is appropriate as the same type of randomness is needed (both have output space of size n factorial):

    function fastshrinksample(r::AbstractRNG,n,k)
        a = Vector{typeof(n)}(k)
        @assert n <= Int64(2)^52
        k == 0 && return a
        mask = (1<<(64-leading_zeros(n)))-1
        nextmask = mask>>1
        nn = n
        for i=1:k
            a[i] = 1+Base.Random.rand_lt(r, nn, mask)
            nn -= 1
            if nn == nextmask
                mask, nextmask = nextmask, nextmask>>1
            end
        end
        return a
    end
    
    fastshrinksample(n,k) = fastshrinksample(Base.Random.GLOBAL_RNG, n, k)
    

    Benchmarking gives a 3x improvement for one tested instance:

    julia> using BenchmarkTools
    
    julia> @btime shrinksample(10000,10000);
      310.277 μs (2 allocations: 78.20 KiB)
    
    julia> @btime fastshrinksample(10000,10000);
      91.815 μs (2 allocations: 78.20 KiB)
    

    The trick is mainly to use the internal Base.Random.rand_lt instead of regular rand(1:n)